Abstract
The growing application of carbon dioxide (CO2) in various environmental and energy fields, including carbon capture and storage (CCS) and several CO2-based enhanced oil recovery (EOR) techniques, highlights the importance of studying the phase equilibria of this gas with water. Therefore, accurate prediction of CO2 solubility in water becomes an important thermodynamic property. This study focused on developing two powerful intelligent models, namely gradient boosting (GBoost) and light gradient boosting machine (LightGBM) that predict CO2 solubility in water with high accuracy. The results revealed the outperformance of the GBoost model with root mean square error (RMSE) and determination coefficient (R2) of 0.137 mol/kg and 0.9976, respectively. The trend analysis demonstrated that the developed models were highly capable of detecting the physical trend of CO2 solubility in water across various pressure and temperature ranges. Moreover, the Leverage technique was employed to identify suspected data points as well as the applicability domain of the proposed models. The results showed that less than 5% of the data points were detected as outliers representing the large applicability domain of intelligent models. The outcome of this research provided insight into the potential of intelligent models in predicting solubility of CO2 in pure water.
Similar content being viewed by others
Explore related subjects
Find the latest articles, discoveries, and news in related topics.Introduction
Phase equilibria calculations of carbon dioxide (CO2) solubility in water holds significant importance in a variety of research areas due to the rising worldwide application of this gas for a broad range of purposes. CO2 is one of the greenhouse gases that cause major ecological issues, including global warming and climate change effects. Over the last few years, carbon capture and utilization (CCU) and carbon capture and storage (CCS) techniques have attracted significant interest from researchers as effective greenhouse gas mitigating strategies1,2. Underground geological formations, such as depleted oil and gas reservoirs, deep saline aquifers, and salt domes have been identified as reliable sites for the CCS process3. Depleted gas and oil reservoirs have been widely regarded as a highly promising choice, primarily due to their large storage capacity and the vast number of available data attained from exploration and production operations4,5,6. The oil industry also takes the advantage of CO2 injection into oil and gas reservoirs as a promising enhanced oil recovery (EOR) method7,8. Moreover, the combination of water and CO2 injection, which is known as water-alternating-gas (WAG), is considered a successful technique because of its ability to control gas mobility and reduce the viscous fingering phenomenon9,10. As seen in many fields, the interactions between water and CO2 play an important role and the dissolution of CO2 could affect the ultimate storage capacity in CCS11,12 and the performance of EOR methods. From an environmental perspective, it could also pose a leakage risk by contaminating underground water13,14,15,16.
Accordingly, CO2 dissolution in water has been intensively investigated using a variety of experimental techniques17,18,19,20. One of the earliest studies that examined the solubility of CO2 in water at sequestration conditions was conducted by Wiebe and Gaddy21 at temperatures between 323 and 373 K and pressures up to almost 71 MPa. According to the literature, the maximum pressure at which the solubility of CO2 gas in water has been experimentally measured is 350 MPa which was reported in the studies of Todheide & Franck22 and Takenouchi and Kennedy23. In 1999, Dhima et al. measured the CO2 solubility in pure water at 344.25 K and pressures up to 100 MPa24. Ahmadi and Chapoy25 utilized a high-pressure setup to examine the CO2 solubility in several salt solutions at temperatures between 300 and 424 K and pressures up to 41 MPa. They conducted experiments with deionized water under equal conditions to verify the validity of their method. Wang et al.26 conducted experiments to determine the solubility of CO2 in water under high pressure and temperature conditions. They investigated the effect of the vapor phase of H2O. A summary of literature experimental research os presented in Table 1, comprising the respective year of study, applied pressure and temperature ranges, and the employed equipment.
CO2 solubility in water can also be estimated by thermodynamic modeling, which is more cost-effective than experimental measurements. In recent years, the potential of theoretical models to effectively represent influential characteristics such as pressure, temperature, and electrolyte concentrations has attracted the interest of researchers attempting to develop models for different systems27. Duan and Sun28 proposed a model to estimate CO2 solubility in water and aqueous NaCl solutions using an extensive databank containing about 1500 data points at temperatures ranging from 273 to 533 K and pressures between 0 and 2000 bar. Spycher et al.29 utilized a calculating approach based on determining new Redlich-Kwong Equation of State (EoS) parameters19 for mutual solubility between CO2 and water, using the CO2 solubility data obtained at 285–373 K and up to 60 MPa. Yan et al.30 measured CO2 solubility in water and NaCl brine in three temperatures of 232, 373, and 413 K and a pressure range of 5 to 40 MPa. By comparing experimental data with Søreide and Whitson EoS model31, they indicated that a modification in the model by refitting the interaction parameter between water and CO2 in the aqueous phase could lead to an acceptable solubility prediction. Recently, statistical associating fluid theory (SAFT) EoSs have also been applied to CO2–electrolyte–water solutions27,32,33. Yan and Chen34 created a model for CO2 solubility in NaCl solution employing a Perturbed-Chain SAFT (PC-SAFT) EoS with a maximum temperature and pressure of 473.15 K and 150 MPa. They determined Henry's constant by fitting the experimental CO2 solubility in pure water at the same pressure and temperature range. Moreover, multiple thermodynamic models of CO2 solubility have been constructed based on various pressures, temperatures, and water salinity conditions25,35,36. Table 2 provides an informative summary of theoretical models found in the literature. It includes information such as the year of study, the pressure and temperature ranges that were examined, the techniques used for prediction, and the error.
In recent years, artificial intelligence (AI) has garnered significant attention from researchers and has become a potent tool for predicting tasks in a variety of fields37,38,39,40,41. AI has been widely applied in the oil industry owing to its advantages over costly and time-consuming experimental procedures and sophisticated thermodynamic models42. AI was utilized in the assessment of a water-alternating-CO2 process37 for determining the minimum miscibility pressure (MMP) of CO238 and predicting oil recovery in CO2-EOR approaches39,40. A considerable number of studies have been carried out in the field of environmental research, specifically in the area of CCS, which have provided valuable insights into the potential of CCS as a feasible approach to mitigate climate change. These studies have focused on predicting the efficiency of CO2 storage41 and assessing the features of coal as an approach to carbon sequestration42,43.
Ghasemian et al.44 developed an artificial neural network (ANN) to estimate the solubility of CO2 in water based on 105 data points from their experiments conducted at pressure and temperature ranges of 0.1–1 MPa and 278.15–348.15 K, respectively. They noticed that the ANN model outperformed the EoSs in CO2 solubility prediction with an absolute average deviation (AAD) of 4%. Hemmati-Sarapardeh et al.45 investigated the efficiency of four machine learning (ML) techniques, namely, radial basis function (RBF), multilayer perceptron (MLP), gene expression programming (GEP), and least-squares support vector machine (LSSVM) models, to estimate CO2 solubility in pure water at high pressures and temperatures. The results showed that the highest level of accuracy was achieved with the LSSVM model optimized by the firefly optimization algorithm (FFA) with a root mean square error (RMSE) of 0.3261. Khoshraftar and Ghaemi utilized ANN and response surface methodology (RSM) to predict the solubility of CO2 in water. Their model development was based on 240 measurements that were conducted within a pressure range of 0.5–200 MPa and a temperature between 313.15 and 473.15 K. The efficiency of MLP and RBF models were compared using an ANN technique. All the developed models accurately predicted solubility, although the RBF and MLP models exhibited the highest performance46.
Jeon and Lee47 developed an ANN based on 2406 data points to predict CO2 solubility in pure water and single-salt aqueous solutions. In order to train the model, 80% of the data bank was used, and the remaining 20% of the data points of solubility in the complex or multi-component solutions were utilized for validation and testing. They stated that the developed ANN model could be extrapolated to predict CO2 solubility in multi-component salt solutions. Over the past years, several investigations have been conducted on the prediction of pure and impure CO2 solubility48 in brine solutions containing different components with the aid of AI49,50,51. Furthermore, research has been carried out to predict the solubility of CO2 in the non-aqueous phase. Certain EOR techniques employ the injection of miscible CO2 into the oil in order to enhance oil mobility by lowering the viscosity of the oil, the IFT, and oil swelling. Rostami et al. developed a gene expression programming (GEP) model for predicting the solubility of CO2 in both dead and live oil, utilizing 106 and 74 data points, respectively. The error analysis revealed that the QEP-based model predicts the solubility of CO2 in both dead oil and living oil accurately, with correlation coefficients (R2) of 0.9860 and 0.9844 for each52. Prior studies that have explored the application of AI to predict the solubility of gases in predicting the solubility of gases in different types of aqueous solutions and non-aqueous phase are summarized in Table 3. This table presents data regarding the year of the study, the applied AI technique, the kind of gas, the solution type, and the temperature and pressure ranges.
The purpose of this study is to develop intelligent models that accurately predict CO2 solubility in pure water. For this, a large data bank with 785 data points containing the values of pressure, temperature, and CO2 solubility in water is gathered. Then, two powerful intelligent models, namely gradient boosting (GBoost) and light gradient boosting machine (LightGBM) are implemented to provide predictions for the CO2 solubility as a function of temperature and pressure. Various statistical and graphical methods are employed to assess the validity and precision of the developed intelligent models. In addition, a trend analysis is undertaken to verify the developed models’ ability to detect physical trends. Lastly, the validity of the data bank and application domain of both models is examined by the Leverage method. As depicted in Table 3, although there are valuable artificial models for forecasting CO2 solubility in aqueous solutions, blending data points of CO2 solubility in pure water, single salt solutions, and diverse brines have been implemented in most predicted models. This study comprises an extensive database covering a broad range of temperature and pressures with reference to CO2 solubility in water. This vast data collection resulted in a well-trained AI model with high accuracy. Both development models offered in this study demonstrate a high level of accuracy with a minimum determination coefficient of 0.995, indicating the reliability of the models. When employing EOR techniques or carbon storage technologies, we encounter a complex scenario involving the combination of gases in a water solution with varying salt levels and components. These validated and precise AI models can be employed in future research to assess complex systems, such as gas mixes and aqueous solutions of varying salinity levels. This not only improves the reliability of these models but also creates new opportunities for enhancing the effectiveness of these operations.
Methodology
Data gathering
In order to develop intelligent models, a trustworthy and broad data set was collected from various sources17,19,20,22,24,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87. Each of the 785 data points in the database contained values for temperature, pressure, and CO2 solubility in pure water. Temperature, pressure were regarded as the models’ inputs, while solubility was specified as the models’ output parameter. These data points have already been used by Hemmati-Sarapardeh et al.45. The data set was randomly partitioned into 80% and 20% subsets for model training and testing, respectively. Table 4 describes the statistical features of the data bank.
Modeling techniques
Gradient boosting (GBoost)
Gradient boosting (GB) is a kind of ensemble supervised tree-based machine learning (ML) approaches that can be utilized for both regression and classification issues88,89,90. It is called an ensemble because the ultimate model's prediction is produced based on various single models’ (decision trees) predictions88. GB, which is portrayed in Fig. 1, is an iterative accumulation of sequentially organized tree-based models of weak learners or predictors that are converted to powerful learners91,92. Commonly, boosting techniques combine weak predictors into a powerful one in an iterative path to minimize the loss function89. This loss function is minimized similarly to an ANN in which weights are tuned93.
To achieve this purpose, it is recommended to choose a function \(h(x,{\theta }_{t})\) to be the most parallel to the negative gradient \({({g}_{t}\left({x}_{i}\right))}_{i=1}^{N}\). By selecting an iterative approach, we can defeat challenge posed by the prediction variables. The function \({g}_{t}\left(x\right)\) for every experimental data is calculated as below94:
To permit the replacement of a hard optimizing problem, one can easily choose the new function increment to be the most matched with \({-g}_{t}\left(x\right)\) utilizing classic least-squares optimization as follows89:
The following stages show a general optimizing procedure of the GBoost94:***
Initializing the \({\widehat{f}}_{0}\) as a constant;
Calculating the negative gradient of \({-g}_{t}\left(x\right)\);
Conforming a next base-learner function \(h(x,{\theta }_{t})\);
Recognizing the optimal gradient descent step-size \({\rho }_{t}\) as:
Updating the model's prediction:
This approach involves the base-learner phase, which consists of a single neuron and the loss function employed is the standard squared error. Through the process of training of the model, the optimum structure is earned.
Light gradient boosting machine (LightGBM)
LightGBM is a gradient-supervised technique based on decision trees and the idea of boosting algorithms95. LightGBM technique, which includes several decision trees, is applicable in various ML tasks like regression, classification, and ranking96,97,98. Each LightGBM technique employs a powerful learning framework to produce prediction values99. Its principal differences from other tree-based models are that it accelerates the training stage by applying histogram-based techniques, decrease memory consumption and it uses a leaf-wise growth strategy with depth constraints95. Figure 2 illutrates a schematic image of the LightGBM. The training process of the LightGBM is determined by the subsequent formula:
Next, \({\widehat{f}}_{(x)}\) will forecast by minimizing the loss function \(L\) as95:
Eventually, the training stage of every regression tree can be indicated as \({W}_{q(x)} , q \in \left\{\text{1,2},3,\dots ,N\right\}\); where W denotes a weight term of every leaf node, q shows utilized decision rules in each tree, and N indicates the number of leaves in a tree100. Thus, by the employing of Newton's method for recognizing objective function, the training outcome of every stage is tuned by the following equation:
Results and discussion
Model's development
This research focused on proposing two intelligent models that accurately predict the solubility of CO2 in pure water. The models were designed to predict the target variable as a function of temperature and pressure. The established models were trained with 628 data points (80% of the data bank) and tested with 157 data points (20% of the data bank). The accuracy of the intelligent models in this study compared to the EoSs is impressive. All statistical and graphical error analyses verify this issue. Furthermore, smart techniques require less input parameter information than EoSs. In our current research, we only utilized pressure and temperature to predict the solubility of CO2 in pure water. In contrast, many EoSs necessitate additional properties. Moreover, the application of EoSs typically consumes a significant amount of time.
For the tuning process, various ranges of hyperparameters were tested to find the optimal value of each hyperparameter. Table 5 shows the optimum values of the GBoost and LightGBM hyperparameters, separately. Max depth refers a maximum depth of the tree. Learning rate determines the step size at each sequential iteration. Min sample split and Min sample leaf show the minimum number of samples needed for splitting an internal node and to form a leaf node, respectively. Also, N estimators represents the number of trees in the forest.
Performance evaluation
To analyze the performance of the developed models, multiple statistical and graphical evaluations were utilized. This research implemented the root mean square error (RMSE) and the correlation coefficient (R2) for statistical evaluation. Equations (8) and (9) provide the mathematical formulations of these criteria.
where \({y}_{i}^{exp}\) is the experimentally determined solubility of CO2 in pure water and \({y}_{i}^{cal}\) is the value of solubility predicted by the models. Also, y represents the mean value of the measured data points. As Table 6 reflects, the two developed models showed strong agreement with experimental data. Although error indices indicate the great accuracy of both models, the GBoost model outperformed the LightGBM algorithm slightly.
Graphical assessments were also used to compare the reliability of the models. For this purpose, various kinds of plots, comprising cross-plots, error distribution plots, and cumulative frequency were plotted. Cross plots are one of the visual assessments that compare the predicted data with experimental measurements. The concentration of data near the Y = X line indicates the accuracy of the developed model. Figure 3 shows the cross plots of the GBoost and LightGBM models. As shown, the dense compactness of train and test data points was seen around the unit slope line. While the points were more distributed close to the Y = X line in the cross plot of the LightGBM model, the GBoost approach demonstrated a stronger concentration around this line.
Figure 4 demonstrates the error distribution diagrams of the established models, where the error is defined as the deviation of predicted data of the solubility from the experimental values. In this diagram, more aggregation around the zero line implies a more accurate model. The plots revealed that the majority of points were located close to the zero-error line and confirmed the accurate predictions. While the GBoost model has an accurate performance over the CO2 solubility range, a slight deviation was observed in the LightGBM model in a few points, especially in high solubility values, indicating that there was a minor relative error at these values.
Group error analysis
Group error diagrams were used to assess the efficiency of the established models in various pressure and temperature ranges. Figure 5 depicts group error plots of the GBoost and LightGBM models in five equal temperature and pressure ranges. As Fig. 5a shows, both models demonstrated higher precision at temperatures below 454.13 K. It should be noted that the GBoost model demonstrated higher reliability than LightGBM throughout all temperature ranges. The effect of pressure on the efficiency of the model was shown in Fig. 5b. As shown, both models provided more accurate predictions at pressures lower than 700 bar. While GBoost demonstrated relatively consistent accuracy at all ranges, increasing the pressure reduced the efficiency of LightGBM.
Cumulative frequency
The cumulative frequency plot is an interactive visualization technique for evaluating the reliability of intelligent models. In these charts, the higher a model is positioned, the more accurate its predictions are. The cumulative frequency curves of the GBoost and LightGBM are illustrated in Fig. 6. The closeness of the two models to the vertical axis signifies that these models predicted the majority of data points with a low error. Considering that the error is defined as the absolute difference between measured experimental and predicted data points, 95% of data predicted by the GBoost model have an error of less than 0.29 mol/kg. By contrast, LightGBM reported 93% of the data at this error value. Both developed models demonstrated an acceptable level of certainty in predicting the solubility of CO2, whilst GBoost showed relatively higher accuracy.
Sensitivity analysis
One of the sensitivity approaches employed to measure the effect of input parameters on the output is the relevancy factor. This factor is employed to determine the correlation between dependent and independent variables, and is defined below101:
where \(I\) and \(O\) symbolize the the input and output variables, respectively. As stated earlier, temperature and pressure were considered input parameters of the developed models for predicting CO2 solubility in water. \({I}_{i}^{k}\) represents the \(i\) th value of \(k\)th input parameters, \({I}_{ave}^{k}\) stands for the average value of the \(k\)th input, and \({O}_{ave}\) indicates the average output. The relevancy factor shows a value between − 1 and 1. Where a positive number implies a direct correlation between the input and output variables, while a negative value shows a reverse one. The degree of influence is expressed by the absolute value of \(r\) so that the greater the r, the more influence of the input. Figure 7 demonstrates the impact of pressure and temperature on CO2 solubility for the GBoost and LightGBM models. As can be seen, the two models reflected relatively the same relevancy factor of input parameters. While a rise of both temperature and pressure would result in an increase in solubility, pressure with a relevancy factor of 0.87 had a stronger impact than temperature with a relevancy factor of almost 0.65.
Model trend analysis
As noted in previous sections, the GBoost approach provided more accurate CO2 solubility predictions. To investigate the impact of pressure and temperature on the CO2 solubility in pure water and to verify the model's engagement to a physically expected trend, the experimentally measured points were compared to the GBoost model’s predictions with respect to the trends of solubility at various temperatures and pressures by keeping one of them at a constant value. The temperature's impact on solubility at three constant pressures is depicted in Fig. 8. As demonstrated, the predicted points were accurately consistent with the experimental data and illustrated that increasing temperatures increases solubility strength. The solubility line at 3500 bar is positioned above the solubility lines at 2000 bar and 1200 bar in the figure, which illustrates the influence of pressure. In addition, the impact of pressure on solubility is represented in Fig. 9. As illustrated in the figure, a rise in pressure enhanced gas solubility at both high and low temperatures. Pressure rise causes compressibility in the gas phase, which leads to the release of more space for additional gas molecules to be dissolved in the water. In Fig. 9a, the solubility at 278 K was higher than that of 304.19 K. Although, in high-pressure regions, the solubility increased as the temperature rise. In other words, at low and medium pressures, heating the liquid or increasing the temperature results in a rise of the molecules’ kinetic energy, leading them to move faster and letting more of them escape the solvent. This indicates that thermal energy overcomes the intermolecular attraction force between carbon dioxide and pure water. However, when gas reaches the supercritical condition, which for CO2 is above 304.13 K and 73.97 \(bar\) of temperature and pressure, it exhibits a more complicated behavior. As seen in Figs. 8 and 9b demonstrate, CO2 tends to behave more as a liquid, showing that both pressure and temperature positively affect solubility102. In conclusion, Figs. 8 and 9 confirmed the validity of the GBoost model and revealed that the model accurately captured the phenomenon’s existing physical trend.
Implementation of the Leverage method
Existing outlier data in a data bank might adversely impact the prediction’s efficiency and applicability of a model. Therefore, the detection of these measured points that differ from the bulk of data is a key step for model development. The Leverage method was employed to detect outliers in this study103,104. According to a statistical perspective and a visual evaluation, this technique identifies outlier points. This method sketches the Williams plot based on a \(H\) value and standardized residual. The \(H\) value, refers to the elements of the Hat matrix, which is computed as below:
where \(k\) and \(N\) indicate the input parameters and number of data points, respectively, and \(T\) refers to the transpose. The standardized residual is measured based on the \({e}_{i}\), which is the deviation between predicted and experimental data, root mean square error (\(RMSE\)), and \(H\) vectors. In the Williams plot, valid data points are situated in the area bounded by the Leverage limit (\({H}^{*}\)), and the suspected limits. This valid zone represents the applicability of the model domain. The mathematical measurement of the Leverage limit is presented in Eq. (13), which corresponds to 0.0115 for the gathered databank. Suspicious limits are defined as a standard residual higher than 3 or less than − 3. In addition, the area with \(H>{H}^{*}\) is classified into two areas based on its SR: good high leverage and bad high leverage. The good high leverage, which represented data points that predicted well but were outside the scope of the model's applicability, relates to the data as \(-3\le SR\le 3\). The bad high leverage area belongs to points with \(SR>3\) or \(SR<-3\)105,106,107.
where k represents the number of inputs (here two) and N is the total number of data points (here 785).
Williams's plot is one of the outlier detection methods for examining the performance of developed models which is widely used in AI application studies45,108,109,110,111. The William plots of the proposed models for predicting the CO2 solubility in water are shown in Fig. 10. As the plots demonstrate, the majority of data points were placed in the valid region, 95.92% of the points for GBoost and 95.67% of them for the LightGBM. For both models, only 2.42% of data exceeded the leverage limit. Overall, as a negligible percentage of data were placed out of the valid zone, the Leverage approach approved the applicability of proposed models and the validity of experimental data points.
Conclusions
In this study, two tree-based models, GBoost and LightGBM, were developed to predict CO2 solubility in pure water based on an extensive data bank including 785 experimental data points collected from diverse sources. Two parameters of pressure and temperature were considered as the input parameters, while solubility was defined as the output.
-
In order to validate models, statistical and graphical evaluations were implemented. Multiple validation procedures approved the high precision agreement between experimental and predicted solubility values. The findings indicated the outperformance of the GBoost model with R2 and RMSE values of 0.9976 and 0.137 mol/kg, respectively.
-
The trend analysis was employed to assess the pressure and temperature effects on the solubility in comparison to the predictions of the model. The trend analysis revealed that the proposed models exhibited the high accuracy in comprehending understanding the physical trend of the problem.
-
For outlier detection, the Leverage approach was implemented which demonstrated the validity and reliability of the models on a large portion of data; nevertheless, only a few points were identified as suspected data.
-
The findings of this study demonstrated that both developed models could be considered potent and trustworthy tools for predicting the solubility of CO2 in water.
Data availability
All the data have been collected from literature. We cited all the references of the data in the manuscript. However, the data will be available from the corresponding author on reasonable request.
References
Xu, T., Apps, J. A. & Pruess, K. Mineral sequestration of carbon dioxide in a sandstone–shale system. Chem. Geol. 217(3), 295–318. https://doi.org/10.1016/j.chemgeo.2004.12.015 (2005).
Cuéllar-Franca, R. M. & Azapagic, A. Carbon capture, storage and utilisation technologies: A critical analysis and comparison of their life cycle environmental impacts. J. CO2 Utilization. 9, 82–102. https://doi.org/10.1016/j.jcou.2014.12.001 (2015).
S. Bachu & S. Bachu. Screening and ranking of hydrocarbon reservoirs for CO2 storage in the Alberta Basin, Canada (2001).
Raza, A. et al. Preliminary assessment of CO2 injectivity in carbonate storage sites. Petroleum 3(1), 144–154. https://doi.org/10.1016/j.petlm.2016.11.008 (2017).
Raza, A. et al. Well selection in depleted oil and gas fields for a safe CO2 storage practice: A case study from Malaysia. Petroleum 3(1), 167–177. https://doi.org/10.1016/j.petlm.2016.10.003 (2017).
Underschultz, J. et al. CO2 storage in a depleted gas field: An overview of the CO2CRC Otway Project and initial results. Int. J. Greenhouse Gas Control 5(4), 922–932. https://doi.org/10.1016/j.ijggc.2011.02.009 (2011).
L. Lake, R. T. Johns, W. R. Rossen, & G. A. Pope. Fundamentals of Enhanced Oil Recovery. Society of Petroleum Engineers.
L. Lake, M. Lotfollahi, & S. Bryant. Fifty years of field observations: Lessons for CO2 storage from CO2 enhanced oil recovery. in 14th Greenhouse Gas Control Technologies Conference Melbourne, 2018, pp. 21–26.
Aziz, H., Muther, T., Khan, M. J. & Syed, F. I. A review on nanofluid water alternating gas (N-WAG): Application, preparation, mechanism, and challenges. Arab. J. Geosci. 14(14), 1416. https://doi.org/10.1007/s12517-021-07787-9 (2021).
D. Merchant. Enhanced oil recovery—The history of CO2 conventional wag injection techniques developed from lab in the 1950’s to 2017. in Carbon Management Technology Conference, 2017, vol. All Days, CMTC-502866-MS. https://doi.org/10.7122/502866-ms
Li, W., Nan, Y., Zhang, Z., You, Q. & Jin, Z. Hydrophilicity/hydrophobicity driven CO2 solubility in kaolinite nanopores in relation to carbon sequestration. Chem. Eng. J. 398, 125449. https://doi.org/10.1016/j.cej.2020.125449 (2020).
Ho, N. L., Perez-Pellitero, J., Porcheron, F. & Pellenq, R. J. M. Enhanced CO2 solubility in hybrid adsorbents: Optimization of solid support and solvent properties for CO2 capture. J. Phys. Chem. C 116(5), 3600–3607. https://doi.org/10.1021/jp2099625 (2012).
Frye, E., Bao, C., Li, L. & Blumsack, S. Environmental controls of cadmium desorption during CO2 leakage. Environ. Sci. Technol. 46(8), 4388–4395. https://doi.org/10.1021/es3005199 (2012).
Xiao, T. et al. Potential chemical impacts of CO2 leakage on underground source of drinking water assessed by quantitative risk analysis. Int. J. Greenhouse Gas Control 50, 305–316. https://doi.org/10.1016/j.ijggc.2016.04.009 (2016).
Little, M. G. & Jackson, R. B. Potential impacts of leakage from deep CO2 geosequestration on overlying freshwater aquifers. Environ. Sci. Technol. 44(23), 9225–9232. https://doi.org/10.1021/es102235w (2010).
Lu, J., Partin, J. W., Hovorka, S. D. & Wong, C. Potential risks to freshwater resources as a result of leakage from CO2 geological storage: A batch-reaction experiment. Environ. Earth Sci. 60(2), 335–348. https://doi.org/10.1007/s12665-009-0382-0 (2010).
Wiebe, R. & Gaddy, V. L. The solubility of carbon dioxide in water at various temperatures from 12 to 40° and at pressures to 500 atmospheres. Critical phenomena*. J. Am. Chem. Soc. 62(4), 815–817. https://doi.org/10.1021/ja01861a033 (1940).
Nakayama, T., Sagara, H., Arai, K. & Saito, S. High pressure liquid–liquid equilibria for the system of water, ethanol and 1,1-difluoroethane at 323.2 K. Fluid Phase Equilibria 38(1), 109–127. https://doi.org/10.1016/0378-3812(87)90007-0 (1987).
King, M. B., Mubarak, A., Kim, J. D. & Bott, T. R. The mutual solubilities of water with supercritical and liquid carbon dioxides. J. Supercrit. Fluids 5(4), 296–302. https://doi.org/10.1016/0896-8446(92)90021-B (1992).
Teng, H. O. & Yamasaki, A. Pressure-mole fraction phase diagrams for CO2-pure water system under temperatures and pressures corresponding to ocean waters at depth to 3000 M. Chem. Eng. Commun. 189(11), 1485–1497. https://doi.org/10.1080/00986440214993 (2002).
Wiebe, R. & Gaddy, V. L. The solubility in water of carbon dioxide at 50, 75 and 100°, at pressures to 700 atmospheres. J. Am. Chem. Society 61(2), 315–318. https://doi.org/10.1021/ja01871a025 (1939).
K. Tödheide & E. U. Franck. Das Zweiphasengebiet und die kritische Kurve im System Kohlendioxid–Wasser bis zu Drucken von 3500 bar. 37(5_6), 387–401 (1963). https://doi.org/10.1524/zpch.1963.37.5_6.387.
Takenouchi, S. & Kennedy, G. C. The binary system H2O-CO2 at high temperatures and pressures. Am. J. Sci. 262(9), 1055. https://doi.org/10.2475/ajs.262.9.1055 (1964).
Dhima, A., de Hemptinne, J.-C. & Jose, J. Solubility of hydrocarbons and CO2 mixtures in water under high pressure. Ind. Eng. Chem. Res. 38(8), 3144–3161. https://doi.org/10.1021/ie980768g (1999).
Ahmadi, P. & Chapoy, A. CO2 solubility in formation water under sequestration conditions. Fluid Phase Equilibria 463, 80–90. https://doi.org/10.1016/j.fluid.2018.02.002 (2018).
Wang, X. et al. Experiment based modeling of CO2 solubility in H2O at 313.15–473.15 K and 0.5–200 MPa. Appl. Geochem. 130, 105005. https://doi.org/10.1016/j.apgeochem.2021.105005 (2021).
Ramdan, D. et al. Prediction of CO2 solubility in electrolyte solutions using the e-PHSC equation of state. J. Supercrit. Fluids 180, 105454. https://doi.org/10.1016/j.supflu.2021.105454 (2022).
Duan, Z. & Sun, R. An improved model calculating CO2 solubility in pure water and aqueous NaCl solutions from 273 to 533 K and from 0 to 2000 bar. Chem. Geol. 193(3), 257–271. https://doi.org/10.1016/S0009-2541(02)00263-2 (2003).
Spycher, N., Pruess, K. & Ennis-King, J. CO2-H2O mixtures in the geological sequestration of CO2. I. Assessment and calculation of mutual solubilities from 12 to 100°C and up to 600 bar. Geochimica et Cosmochimica Acta 67(16), 3015–3031. https://doi.org/10.1016/S0016-7037(03)00273-4 (2003).
Yan, W., Huang, S. & Stenby, E. H. Measurement and modeling of CO2 solubility in NaCl brine and CO2–saturated NaCl brine density. Int. J. Greenhouse Gas Control 5(6), 1460–1477. https://doi.org/10.1016/j.ijggc.2011.08.004 (2011).
Søreide, I. & Whitson, C. H. Peng-Robinson predictions for hydrocarbons, CO2, N2, and H2S with pure water and NaCI brine. Fluid Phase Equilibria 77, 217–240. https://doi.org/10.1016/0378-3812(92)85105-H (1992).
Rozmus, J., de Hemptinne, J.-C., Galindo, A., Dufal, S. & Mougin, P. Modeling of strong electrolytes with ePPC-SAFT up to high temperatures. Ind. Eng. Chem. Res. 52(29), 9979–9994. https://doi.org/10.1021/ie303527j (2013).
Ji, X., Tan, S. P., Adidharma, H. & Radosz, M. SAFT1-RPM approximation extended to phase equilibria and densities of CO2−H2O and CO2−H2O−NaCl systems. Ind. Eng. Chem. Res. 44(22), 8419–8427. https://doi.org/10.1021/ie050725h (2005).
Yan, Y. & Chen, C.-C. Thermodynamic modeling of CO2 solubility in aqueous solutions of NaCl and Na2SO4. J. Supercrit. Fluids 55(2), 623–634. https://doi.org/10.1016/j.supflu.2010.09.039 (2010).
Haghtalab, A., Asadi, E. & Shahsavari, M. High-pressure vapor-liquid equilibrium measurement of CO2 solubility into aqueous solvents of (diisopropylamine + l-lysine) and (diisopropylamine + piperazine + l-lysine) at different temperatures and compositions. J. Chem. Eng. Data 66(11), 4254–4271. https://doi.org/10.1021/acs.jced.1c00725 (2021).
Liu, Z., Cui, P., Cui, X., Wang, X. & Du, D. Prediction of CO2 solubility in NaCl brine under geological conditions with an improved binary interaction parameter in the Søreide-Whitson model. Geothermics 105, 102544. https://doi.org/10.1016/j.geothermics.2022.102544 (2022).
Le Van, S. & Chon, B. H. Applicability of an artificial neural network for predicting water-alternating-CO2 performance. Energies 10(7), 842 (2017).
Dong, P., Liao, X., Chen, Z. & Chu, H. An improved method for predicting CO2 minimum miscibility pressure based on artificial neural network. Adv. Geo-Energy Res. 3(4), 355–364. https://doi.org/10.26804/ager.2019.04.02 (2019).
Vo Thanh, H., Sugai, Y. & Sasaki, K. Application of artificial neural network for predicting the performance of CO2 enhanced oil recovery and storage in residual oil zones. Sci. Rep. 10(1), 18204. https://doi.org/10.1038/s41598-020-73931-2 (2020).
Le Van, S. & Chon, B. H. Evaluating the critical performances of a CO2-enhanced oil recovery process using artificial neural network models. J. Petrol. Sci. Eng. 157, 207–222. https://doi.org/10.1016/j.petrol.2017.07.034 (2017).
Kim, Y., Jang, H., Kim, J. & Lee, J. Prediction of storage efficiency on CO2 sequestration in deep saline aquifers using artificial neural network. Appl. Energy 185, 916–928. https://doi.org/10.1016/j.apenergy.2016.10.012 (2017).
Ibrahim, A. F. Application of various machine learning techniques in predicting coal wettability for CO2 sequestration purpose. Int. J. Coal Geol. 252, 103951. https://doi.org/10.1016/j.coal.2022.103951 (2022).
Lv, Q. et al. On the evaluation of coal strength alteration induced by CO2 injection using advanced black-box and white-box machine learning algorithms. SPE J. https://doi.org/10.2118/218403-pa (2024).
Ghasemian, N., Kalbasi, M. & Pazuki, G. Experimental study and mathematical modeling of solubility of CO2 in water: Application of artificial neural network and genetic algorithm. J. Dispersion Sci. Technol. 34(3), 347–355. https://doi.org/10.1080/01932691.2012.667293 (2013).
Hemmati-Sarapardeh, A., Amar, M. N., Soltanian, M. R., Dai, Z. & Zhang, X. Modeling CO2 solubility in water at high pressure and temperature conditions. Energy Fuels 34(4), 4761–4776. https://doi.org/10.1021/acs.energyfuels.0c00114 (2020).
Khoshraftar, Z. & Ghaemi, A. Prediction of CO2 solubility in water at high pressure and temperature via deep learning and response surface methodology. Case Stud. Chem. Environ. Eng. 7, 100338. https://doi.org/10.1016/j.cscee.2023.100338 (2023).
Jeon, P. R. & Lee, C.-H. Artificial neural network modelling for solubility of carbon dioxide in various aqueous solutions from pure water to brine. J. CO Utilization. 47, 101500. https://doi.org/10.1016/j.jcou.2021.101500 (2021).
Nakhaei-Kohani, R., Taslimi-Renani, E., Hadavimoghaddam, F., Mohammadi, M.-R. & Hemmati-Sarapardeh, A. Modeling solubility of CO2–N2 gas mixtures in aqueous electrolyte systems using artificial intelligence techniques and equations of state. Sci. Rep. 12(1), 3625. https://doi.org/10.1038/s41598-022-07393-z (2022).
Mohammadian, E., Liu, B. & Riazi, A. Evaluation of different machine learning frameworks to estimate CO2 solubility in NaCl brines: Implications for CO2 injection into low-salinity formations. Lithosphere. 12, 1615832 (2022).
Menad, N. A., Hemmati-Sarapardeh, A., Varamesh, A. & Shamshirband, S. Predicting solubility of CO2 in brine by advanced machine learning systems: Application to carbon capture and sequestration. J. CO2 Utilization. 33, 83–95. https://doi.org/10.1016/j.jcou.2019.05.009 (2019).
Raji, M., Dashti, A., Amani, P. & Mohammadi, A. H. Efficient estimation of CO2 solubility in aqueous salt solutions. J. Mol. Liquids 283, 804–815. https://doi.org/10.1016/j.molliq.2019.02.090 (2019).
Rostami, A., Arabloo, M., Kamari, A. & Mohammadi, A. H. Modeling of CO2 solubility in crude oil during carbon dioxide enhanced oil recovery using gene expression programming. Fuel 210, 768–782. https://doi.org/10.1016/j.fuel.2017.08.110 (2017).
Rostami, A., Arabloo, M., Lee, M. & Bahadori, A. Applying SVM framework for modeling of CO2 solubility in oil during CO2 flooding. Fuel 214, 73–87. https://doi.org/10.1016/j.fuel.2017.10.121 (2018).
Nakhaei-Kohani, R. et al. Chemical structure and thermodynamic properties based models for estimating nitrous oxide solubility in ionic Liquids: Equations of state and Machine learning approaches. J. Mol. Liquids 367, 120445. https://doi.org/10.1016/j.molliq.2022.120445 (2022).
Austin, W. H., Lacombe, E., Rand, P. W. & Chatterjee, M. Solubility of carbon dioxide in serum from 15 to 38 C. J. Appl. Physiol. 18(2), 301–304. https://doi.org/10.1152/jappl.1963.18.2.301 (1963).
Bando, S., Takemura, F., Nishio, M., Hihara, E. & Akai, M. Solubility of CO2 in aqueous solutions of NaCl at (30 to 60) °C and (10 to 20) MPa. J. Chem. Eng. Data 48(3), 576–579. https://doi.org/10.1021/je0255832 (2003).
Briones, J. A., Mullins, J. C., Thies, M. C. & Kim, B. U. Ternary phase equilibria for acetic acid-water mixtures with supercritical carbon dioxide. Fluid Phase Equilibria 36, 235–246. https://doi.org/10.1016/0378-3812(87)85026-4 (1987).
Chapoy, A., Mohammadi, A. H., Chareton, A., Tohidi, B. & Richon, D. Measurement and modeling of gas solubility and literature review of the properties for the carbon dioxide−water system. Ind. Eng. Chem. Res. 43(7), 1794–1802. https://doi.org/10.1021/ie034232t (2004).
Dodds, W. S., Stutzman, L. F. & Sollami, B. J. Carbon dioxide solubility in water. Ind. Eng. Chem. Chem. Eng. Data Series 1(1), 92–95. https://doi.org/10.1021/i460001a018 (1956).
Ferrentino, G., Barletta, D., Donsì, F., Ferrari, G. & Poletto, M. Experimental measurements and thermodynamic modeling of CO2 solubility at high pressure in model apple juices. Ind. Eng. Chem. Res. 49(6), 2992–3000. https://doi.org/10.1021/ie9009974 (2010).
Findlay, A. & Howell, O. R. XXXII—The solubility of carbon dioxide in water in the presence of starch. J. Chem. Soc. Trans. 107, 282–284. https://doi.org/10.1039/CT9150700282 (1915).
Findlay, A. & Williams, T. LXXI—The influence of colloids and fine suspensions on the solubility of gases in water. Part III. Solubility of carbon dioxide at pressures lower than atmospheric. J. Chem. Society Trans. 103, 636–645. https://doi.org/10.1039/CT9130300636 (1913).
Han, J. M., Shin, H. Y., Min, B.-M., Han, K.-H. & Cho, A. Measurement and correlation of high pressure phase behavior of carbon dioxide+water system. J. Ind. Eng. Chem. 15(2), 212–216. https://doi.org/10.1016/j.jiec.2008.09.012 (2009).
Koschel, D., Coxam, J.-Y., Rodier, L. & Majer, V. Enthalpy and solubility data of CO2 in water and NaCl(aq) at conditions of interest for geological sequestration. Fluid Phase Equilibria 247(1), 107–120. https://doi.org/10.1016/j.fluid.2006.06.006 (2006).
Li, Y.-H. & Tsui, T.-F. The solubility of CO2 in water and sea water. J. Geophys. Res. (1896–1977) 76(18), 4203–4207. https://doi.org/10.1029/JC076i018p04203 (1971).
Li, Z., Dong, M., Li, S. & Dai, L. Densities and solubilities for binary systems of carbon dioxide + water and carbon dioxide + brine at 59 °C and pressures to 29 MPa. J. Chem. Eng. Data 49(4), 1026–1031. https://doi.org/10.1021/je049945c (2004).
Markham, A. E. & Kobe, K. A. The solubility of carbon dioxide and nitrous oxide in aqueous salt solutions. J. Am. Chem. Society 63(2), 449–454. https://doi.org/10.1021/ja01847a027 (1941).
Matouš, J., Šobr, J., Novak, J. & Pick, J. Solubility of carbon dioxide in water at pressures up to 40 atm. Collect. Czechoslovak Chem. Commun. 34(12), 3982–3985 (1969).
Morgan, O. M. & Maass, O. An investigation of the equilibria existing in gas-water systems forming electrolytes. Can. J. Res. 5(2), 162–199. https://doi.org/10.1139/cjr31-062 (1931).
Morrison, T. J. & Billett, F. The salting-out of non-electrolytes. Part II. The effect of variation in non-electrolyte. J. Chem. Society (Resumed). 3819, 3822. https://doi.org/10.1039/JR9520003819 (1952).
Müller, G., Bender, E. & Maurer, G. Das Dampf-Flüssigkeitsgleichgewicht des ternären Systems Ammoniak-Kohlendioxid-Wasser bei hohen Wassergehalten im Bereich zwischen 373 und 473 Kelvin. Berichte der Bunsengesellschaft für physikalische Chemie 92(2), 148–160. https://doi.org/10.1002/bbpc.198800036 (1988).
Nighswander, J. A., Kalogerakis, N. & Mehrotra, A. K. Solubilities of carbon dioxide in water and 1 wt. % sodium chloride solution at pressures up to 10 MPa and temperatures from 80 to 200.degree.C. J. Chem. Eng. Data 34(3), 355–360. https://doi.org/10.1021/je00057a027 (1989).
Novak, J., Fried, V. & Pick, J. Löslichkeit des Kohlendioxyds in Wasser bei verschiedenen Drücken und Temperaturen. Collect. Czechoslovak Chem. Commun. 26(9), 2266–2270 (1961).
Qin, J., Rosenbauer, R. J. & Duan, Z. Experimental measurements of vapor-liquid equilibria of the H2O + CO2 + CH4 ternary system. J. Chem. Eng. Data 53(6), 1246–1249. https://doi.org/10.1021/je700473e (2008).
Sako, T. et al. Phase equilibrium study of extraction and concentration of furfural produced in reactor using supercritical carbon dioxide. J. Chem. Eng. Japan 24(4), 449–455. https://doi.org/10.1252/jcej.24.449 (1991).
Servio, P. & Englezos, P. Effect of temperature and pressure on the solubility of carbon dioxide in water in the presence of gas hydrate. Fluid Phase Equilibria 190(1), 127–134. https://doi.org/10.1016/S0378-3812(01)00598-2 (2001).
Shedlovsky, T. & MacInnes, D. A. The first ionization constant of carbonic acid, 0 to 38°, from conductance measurements. J. Am. Chem. Society 57(9), 1705–1710. https://doi.org/10.1021/ja01312a064 (1935).
Takenouchi, S. & Kennedy, G. C. The binary system H2O–CO2 at high temperatures and pressures. Am. J. Sci. 262(9), 1055–1074 (1964).
Teng, H., Yamasaki, A., Chun, M. K. & Lee, H. Solubility of liquid CO2 in water at temperatures from 278 K to 293 K and pressures from 6.44 MPa to 29.49 MPa and densities of the corresponding aqueous solutions. J. Chem. Thermodyn. 29(11), 1301–1310. https://doi.org/10.1006/jcht.1997.0249 (1997).
Valtz, A., Chapoy, A., Coquelet, C., Paricaud, P. & Richon, D. Vapour–liquid equilibria in the carbon dioxide–water system, measurement and modelling from 278.2 to 318.2K. Fluid Phase Equilibria 226, 333–344. https://doi.org/10.1016/j.fluid.2004.10.013 (2004).
Weiss, R. F. Carbon dioxide in water and seawater: The solubility of a non-ideal gas. Mar. Chem. 2(3), 203–215. https://doi.org/10.1016/0304-4203(74)90015-2 (1974).
Wiebe, R. & Gaddy, V. The solubility in water of carbon dioxide at 50, 75 and 100, at pressures to 700 atmospheres. J. Am. Chem. Society 61(2), 315–318 (1939).
Wiebe, R. & Gaddy, V. L. Vapor phase composition of carbon dioxide-water mixtures at various temperatures and at pressures to 700 atmospheres. J. Am. Chem. Society 63(2), 475–477. https://doi.org/10.1021/ja01847a030 (1941).
Yeh, S.-Y. & Peterson, R. E. Solubility of carbon dioxide, krypton, and xenon in aqueous solution. J. Pharm. Sci. 53(7), 822–824. https://doi.org/10.1002/jps.2600530728 (1964).
Zawisza, A. & Malesinska, B. Solubility of carbon dioxide in liquid water and of water in gaseous carbon dioxide in the range 0.2–5 MPa and at temperatures up to 473 K. J. Chem. Eng. Data 26(4), 388–391. https://doi.org/10.1021/je00026a012 (1981).
Zheng, D.-Q., Guo, T.-M. & Knapp, H. Experimental and modeling studies on the solubility of CO2, CHC1F2, CHF3, C2H2F4 and C2H4F2 in water and aqueous NaCl solutions under low pressures. Fluid Phase Equilibria 129(1), 197–209. https://doi.org/10.1016/S0378-3812(96)03177-9 (1997).
Liu, Y., Hou, M., Yang, G. & Han, B. Solubility of CO2 in aqueous solutions of NaCl, KCl, CaCl2 and their mixed salts at different temperatures and pressures. J. Supercrit. Fluids 56(2), 125–129. https://doi.org/10.1016/j.supflu.2010.12.003 (2011).
Belyadi, H. & Haghighat, A. Machine Learning Guide for Oil and Gas Using Python (Elsevier, New York, 2021).
Natekin, A. & Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobot. 7, 21 (2013).
Bentéjac, C., Csörgő, A. & Martínez-Muñoz, G. A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 54(3), 1937–1967 (2021).
Chahar, J., Verma, J., Vyas, D. & Goyal, M. Data-driven approach for hydrocarbon production forecasting using machine learning techniques. J. Petrol. Sci. Eng. 217, 110757 (2022).
Otchere, D. A., Ganat, T. O. A., Ojero, J. O., Tackie-Otoo, B. N. & Taki, M. Y. Application of gradient boosting regression model for the evaluation of feature selection techniques in improving reservoir characterisation predictions. J. Petrol. Sci. Eng. 208, 109244 (2022).
J. H. Friedman. Greedy function approximation: A gradient boosting machine. Ann. Stat. 1189–1232 (2001).
Larestani, A., Mousavi, S. P., Hadavimoghaddam, F. & Hemmati-Sarapardeh, A. Predicting formation damage of oil fields due to mineral scaling during water-flooding operations: Gradient boosting decision tree and cascade-forward back-propagation network. J. Petrol. Sci. Eng. 208, 109315 (2022).
Fan, J. et al. Light gradient boosting machine: An efficient soft computing model for estimating daily reference evapotranspiration with local and external meteorological data. Agric. Water Manag. 225, 105758 (2019).
Gu, Y. et al. Data-driven estimation for permeability of simplex pore-throat reservoirs via an improved light gradient boosting machine: A demonstration of sand-mud profile, Ordos Basin, northern China. J. Petrol. Sci. Eng. 217, 110909 (2022).
Gu, Y., Zhang, D., Lin, Y., Ruan, J. & Bao, Z. Data-driven lithology prediction for tight sandstone reservoirs based on new ensemble learning of conventional logs: A demonstration of a Yanchang member, Ordos Basin. J. Petrol. Sci. Eng. 207, 109292 (2021).
Taha, A. A. & Malebary, S. J. An intelligent approach to credit card fraud detection using an optimized light gradient boosting machine. IEEE Access 8, 25579–25587 (2020).
Mahdaviara, M., Sharifi, M., Bakhshian, S. & Shokri, N. Prediction of spontaneous imbibition in porous media using deep and ensemble learning techniques. Fuel 329, 125349 (2022).
Zhou, B. et al. Pressure of different gases injected into large-scale coal matrix: analysis of time–space dependence and prediction using light gradient boosting machine. Fuel 279, 118448 (2020).
Hosseinzadeh, M. & Hemmati-Sarapardeh, A. Toward a predictive model for estimating viscosity of ternary mixtures containing ionic liquids. J. Mol. Liq. 200, 340–348. https://doi.org/10.1016/j.molliq.2014.10.033 (2014).
Sabirzyanov, A. N., Il’in, A. P., Akhunov, A. R. & Gumerov, F. M. Solubility of water in supercritical carbon dioxide. High Temperature 40(2), 203–206. https://doi.org/10.1023/A:1015294905132 (2002).
P. Rousseeuw & A. Leroy. Robust regression and outlier detection: Wiley Interscience. New York (1987).
Rousseeuw, P. J. & van Zomeren, B. C. Unmasking multivariate outliers and leverage points. J. Am. Stat. Assoc. 85(411), 633–639. https://doi.org/10.1080/01621459.1990.10474920 (1990).
C. R. Goodall. 13 computation using the QR decomposition. in Handbook of Statistics, vol. 9. (Elsevier, 1993), 467–508.
Gramatica, P. Principles of QSAR models validation: Internal and external. QSAR Combinatorial Sci. 26(5), 694–701 (2007).
A. H. Sarapardeh, A. Larestani, N. A. Menad, & S. Hajirezaie. Applications of Artificial Intelligence Techniques in the Petroleum Industry. (Gulf Professional Publishing, 2020).
Lv, Q. et al. Application of group method of data handling and gene expression programming to modeling molecular diffusivity of CO2 in heavy crudes. Geoenergy Sci. Eng. 237, 212789. https://doi.org/10.1016/j.geoen.2024.212789 (2024).
Lv, Q. et al. Modelling CO2 diffusion coefficient in heavy crude oils and bitumen using extreme gradient boosting and Gaussian process regression. Energy 275, 127396. https://doi.org/10.1016/j.energy.2023.127396 (2023).
Salehi, K., Rahmani, M. & Atashrouz, S. Machine learning assisted predictions for hydrogen storage in metal-organic frameworks. Int. J. Hydrogen Energy 48(85), 33260–33275. https://doi.org/10.1016/j.ijhydene.2023.04.338 (2023).
Zheng, H., Mahmoudzadeh, A., Amiri-Ramsheh, B. & Hemmati-Sarapardeh, A. Modeling viscosity of CO2–N2 Gaseous mixtures using robust tree-based techniques: extra tree, random forest, GBoost, and LightGBM. ACS Omega 8(15), 13863–13875. https://doi.org/10.1021/acsomega.3c00228 (2023).
Funding
Open Access funding enabled and organized by Projekt DEAL.
Author information
Authors and Affiliations
Contributions
A.M.: Writing – original draft, Methodology, Investigation, Visualization, data, Software B.A.: Writing – original draft, Methodology, Visualization, Software, S.A.: Writing-Review & Editing, Validation, Conceptualization, Methodology A.A.: Writing-Review & Editing, Validation, Conceptualization, M.A.: Validation, Conceptualization, data, M.O.: Methodology, Validation, Supervision, Writing-Review & Editing. A.M.: Writing-Review & Editing, Validation, Conceptualization, A.H.: Methodology, Validation, Supervision, Writing-Review & Editing, Software.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Mahmoudzadeh, A., Amiri-Ramsheh, B., Atashrouz, S. et al. Modeling CO2 solubility in water using gradient boosting and light gradient boosting machine. Sci Rep 14, 13511 (2024). https://doi.org/10.1038/s41598-024-63159-9
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-024-63159-9
- Springer Nature Limited