Introduction

Recent years have seen a sharp rise in the total amount of carbon dioxide in the atmosphere, causing numerous environmental concerns1. CO2 has been identified as the predominant greenhouse gas in the atmosphere and as the primary cause of climate change2,3,4. Human activities, including the combustion of fossil fuels, agricultural practices, and industrial processes, are the primary sources of CO2 emissions into the atmosphere5. Therefore, efficient carbon capture and storage techniques are crucial for environmental protection6.

Carbon dioxide capture technologies fall into four primary classes: cryogenic distillation7,8,9,10, membrane separation11,12,13,14, absorption15,16,17,18, and adsorption19,20,21,22. The implementation of membrane, cryogenic, and absorption technologies has encountered significant obstacles due to their high cost23, environmental concerns24, and substantial energy consumption25. In contrast, adsorption is recognized as an environmentally sustainable method that consumes less energy during CO2 separation while offering high selectivity and efficiency26. Recently, carbon capture and storage (CCS) has garnered considerable attention owing to its wide range of applications27.

Commercial adsorbents used in CO2 adsorption methods include porous carbon28,29,30, zeolites31,32,33, and metal–organic frameworks (MOFs)34,35,36. Despite the high CO2 adsorption capacities of zeolites and MOFs, these highly hydrophilic materials can lose much of their capacity in the presence of water37,38, although some recently synthesized MOFs are stable against water39. In contrast, porous carbon is hydrophobic, has a large surface area, and demonstrates exceptional thermal and chemical stability. In addition, it is simple to prepare, inexpensive, and has a tunable pore size. The structure and porosity of the resulting porous carbon are determined by the type of carbon precursor and the preparation technique used40.

Numerous investigations have been devoted to developing porous carbons with a high pore volume and specific surface area in order to enhance CO2 adsorption29,41. Narrow micropores are now generally acknowledged to be the main contributor to CO2 adsorption; nevertheless, the precise link between pore structure and CO2 adsorption remains unclear42,43,44. There have been phenomenological efforts to relate textural properties to CO2 adsorption capacity. Durá et al., for example, proposed a regression equation that correlates CO2 adsorption with the volumes of micropores and mesopores45. As shown in Eq. (1), CO2 adsorption was predicted using a partial least squares regression approach. Although this regression equation can produce a reasonable estimate, the polynomial regression approach makes it difficult to arrive at an ideal empirical formula. In addition, such a regression equation cannot be retrained and improved as the database grows46. Importantly, according to Eq. (1), an ordered mesoporous carbon without micropores should adsorb almost no CO2, which is not true in some instances. Moreover, the equation is derived from a small sample size, with only 12 varieties of porous carbon examined in that study. All of these samples were measured at a pressure of 5 bar and a temperature of 25 °C, so adsorbents described by this relationship are restricted to those operating conditions.

$$ M_{ads}^{CO_{2}} = 0.095 + 2.10 \times V_{micro} + 3.51 \times V_{micro} \times V_{meso} $$
(1)
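For readers who wish to reproduce the behaviour of Eq. (1), the short Python sketch below evaluates the correlation for two illustrative volume pairs; the function name and sample values are hypothetical and are not taken from Durá et al.

```python
def dura_co2_uptake(v_micro: float, v_meso: float) -> float:
    """CO2 uptake (mmol/g) from Eq. (1); only meaningful near 25 degC and 5 bar,
    the conditions of the original 12-sample study."""
    return 0.095 + 2.10 * v_micro + 3.51 * v_micro * v_meso

# Illustrative textural properties in cm3/g (not from the original data set).
print(dura_co2_uptake(v_micro=0.50, v_meso=0.30))  # ~1.67 mmol/g
print(dura_co2_uptake(v_micro=0.00, v_meso=0.80))  # ~0.10 mmol/g: no micropores -> almost no predicted uptake
```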

To date, quantitative investigations utilizing the known textural parameters of porous carbons have enabled reasonably accurate forecasts of CO2 adsorption. Shen et al.47 developed hierarchical porous activated carbon fibers, which exhibit faster adsorption rates and higher capacity than pure carbon materials. Xia et al.48 showed that carbons combining a high surface area, microporosity, nitrogen functionality, and a zeolite-templated pattern have the highest CO2 adsorption capacity. Casco et al.49 observed that activated carbons derived from crude oil with potassium hydroxide activation display exceptional CO2 adsorption performance under both atmospheric and high-pressure conditions. The optimal carbon structure depends on the specific application, with narrow micropores exerting the dominant influence and ensuring a high delivery capacity. A review of the existing literature shows that a comprehensive and highly accurate model that predicts the amount of carbon dioxide adsorbed from the textural properties of the adsorbent and the operating conditions is still lacking. Deep learning (DL) models are frequently regarded as a robust approach for such modeling owing to their capacity to yield highly accurate predictions; however, because they operate as black boxes, they cannot by themselves offer a quantitative analysis of each input.

In recent years, deep learning has exhibited significant potential in addressing a multitude of material-research challenges50,51,52,53. Specifically, research endeavors have been undertaken within the domain of neural network modeling and simulation to study carbon dioxide adsorption processes. Dashti et al.54 created an MLP network to assess the adsorption of pure gases on activated carbon and zeolite-5A. They developed accurate models using input parameters such as temperature, pressure, pore size, and kinetic diameter. To optimize the model, various hidden layers were constructed, and the dataset's AARD% value was used to evaluate performance. Fotoohi et al.55 used four two-dimensional equations of state to assess the adsorption of pure and binary gases onto activated carbons. They applied the LM algorithm for model learning and the ANN method for prediction. The optimal architectures were 1-6-7 for pure gas adsorption and 1-7-9 for binary gas adsorption, achieving higher precision than the two-dimensional equations of state. Iraji et al.56 investigated the adsorption of CO2 and SO2 on modified carbon nanotubes. They proposed an MLP neural network model with three hidden layers and 10 neurons, trained with the LM training algorithm. Lepri et al.57 developed reduced-order PSA models using artificial neural networks that demonstrated excellent agreement between ANN and simulation results, allowing their implementation in optimization environments for PSA cycle synthesis. Meng et al.58 explored the adsorption of supercritical CO2 and CH4 on coal using conventional isotherm models, and additionally proposed a machine learning model based on a 15-neuron neural network. Zhang et al.44 predicted the CO2 adsorption capacity of porous carbons using a DL algorithm with five input parameters: specific surface area, micropore volume, mesopore volume, temperature, and pressure; the highest prediction accuracy was achieved when all three textural properties were used. At low pressure, microporous carbon adsorbs more CO2, whereas hierarchical porous carbon adsorbs more at high pressure. However, the DNN model in that research was built with the MATLAB toolbox, whose limitations, such as allowing only two hidden layers with equal numbers of neurons and no way to change the default activation functions of the layers, prevent it from being an optimal model.

Table 1 presents a summary of further research on ANN models for simulating CO2 adsorption processes. It can be seen that the diversity of training algorithms applied to train MLP neural networks for CO2 adsorption is limited compared to other processes, with the Levenberg–Marquardt (LM) and Bayesian regularization (BR) algorithms being the primary choices.

Table 1 Some studies carried out in the application of neural networks on CO2 adsorption.

To fill the aforementioned gaps, an MLP neural network and, for the first time, an RBF network were used to model and simulate 1345 collected data points representing more than 200 distinct porous carbon adsorbents and to predict the amount of CO2 adsorbed. Inputs for these models included the adsorbent textural properties (BET surface area, mesopore volume, and micropore volume) and the operating conditions (temperature and pressure). According to the literature, the effects of textural properties and adsorption conditions on CO2 adsorption are not independent; therefore, a Pearson correlation coefficient analysis was used as a preliminary step to investigate the primary linear relationships between each pair of variables. In the MLP neural network, 13 distinct training algorithms and four combinations of hidden-layer activation functions were applied to each algorithm to optimize the network. Accuracy, run time, and number of epochs were then used as criteria for comparing the models and selecting the optimal MLP model. The present study also conducted a thorough, simulation-based evaluation of the factors that influence CO2 adsorption, addressing the lack of such analysis in previous studies and highlighting the importance of understanding how these factors affect the adsorption process.

Data gathering and preparation

More than 150 papers were screened to compile data for this study. The data used for modeling and simulation were selected from literature covering over 200 adsorbents operating at various temperatures and pressures. Some of these data were collected by Zhang et al.44, and this study added 325 new data points to their collection. The ranges of BET surface area, mesopore volume, micropore volume, temperature, pressure, and the amount of carbon dioxide adsorbed are presented in Table 2. Acquiring a precise neural network model requires a large quantity of high-quality data, so a standard for gathering data was established to reduce the error introduced by differing approaches to calculating the parameters that influence adsorption. The composition of each adsorbent contains either zero or a negligible amount of nitrogen. Nitrogen-doped carbon-based adsorbents demonstrate superior CO2 adsorption capacity and heightened adsorption selectivity compared to their non-nitrogen counterparts, an improvement attributed to the significant enhancement of basic adsorption sites caused by the presence of nitrogen66. Therefore, to prevent errors, only nitrogen-free sorbents were investigated. All specific surface areas were calculated using the BET equation from nitrogen adsorption at 77 K. In this database, the total pore volume of the adsorbent was estimated from the amount of nitrogen adsorbed at a relative pressure of 0.95–0.99. The Dubinin–Radushkevich (D–R) equation was used to obtain the volume of micropores, and the volume of mesopores was calculated by subtracting the micropore volume from the total pore volume. The BET surface area is expressed in square meters per gram (m2/g), the micropore and mesopore volumes in cubic centimeters per gram (cm3/g), the temperature in degrees Celsius (°C), the pressure in bar, and the amount of carbon dioxide adsorbed in millimoles of CO2 per gram of adsorbent (mmol/g).
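As a minimal sketch of how one database record is assembled under the conventions above (BET area from N2 at 77 K, micropore volume from the D–R fit, mesopore volume by difference), the snippet below uses purely illustrative values rather than literature data.

```python
# One hypothetical record; all numbers are placeholders, not literature data.
record = {
    "bet_surface_m2_g": 1510.0,   # BET fit of the N2 isotherm at 77 K
    "v_total_cm3_g": 1.28,        # N2 uptake at relative pressure 0.95-0.99
    "v_micro_cm3_g": 0.53,        # Dubinin-Radushkevich (D-R) equation
    "temperature_C": 25.0,
    "pressure_bar": 5.0,
    "co2_uptake_mmol_g": 3.9,
}
# Mesopore volume = total pore volume - micropore volume.
record["v_meso_cm3_g"] = record["v_total_cm3_g"] - record["v_micro_cm3_g"]
print(record["v_meso_cm3_g"])  # 0.75
```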

Table 2 The range of data employed in this study.

The Pearson correlation coefficient for each pair of variables is the ratio of their covariance to the product of their standard deviations. Based on the Pearson correlation coefficient matrix (Fig. 1), there is no significant linear correlation between the adsorbent textural properties and the CO2 uptake capacity. The correlations between adsorption capacity and the volumes of mesopores and micropores (R = 0.017 and R = 0.147, respectively) indicate weak relationships. Moreover, there is a strong positive correlation (R = 0.807) between micropore volume and BET surface area. CO2 adsorption capacity, in contrast, shows a strong positive correlation with pressure (R = 0.776) and a moderately weak negative correlation with temperature (R = − 0.238).
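The matrix in Fig. 1 can be reproduced from the raw table with a few lines of pandas; the file name and column names below are placeholders for whatever layout the compiled database actually uses.

```python
import pandas as pd

df = pd.read_csv("co2_adsorption_database.csv")  # hypothetical file name
cols = ["bet_surface", "v_meso", "v_micro", "temperature", "pressure", "co2_uptake"]
corr_matrix = df[cols].corr(method="pearson")    # covariance / product of standard deviations
print(corr_matrix.round(3))
```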

Figure 1

Pearson correlation matrix between any two variables of porous carbon adsorbents and CO2 adsorption capacity based on the total database.

In this study, 1345 data points were acquired, of which 1300 were used to develop the artificial neural network models; the remaining 45 data points were randomly set aside for predicting carbon dioxide adsorption. MATLAB software randomly separated 80% of the 1300 data points for training, 10% for validation, and the remaining 10% for testing. Selecting the proper inputs and output is one of the important stages in creating a neural network model. As stated in the introduction, previous research has demonstrated that, in addition to operating parameters such as temperature and pressure, the textural properties of the adsorbent, namely the BET surface area, the mesopore volume, and the micropore volume, significantly influence the adsorption process. As a result, BET surface area, mesopore volume, micropore volume, temperature, and pressure were chosen as the network input variables. Because the objective of the model is to predict the carbon dioxide adsorption capacity, the amount of carbon dioxide adsorbed is deemed the network's output. To reduce the impact of parameters with greater magnitudes on the ANN design, the entire database was normalized to the range 0–1 (Supplementary Information).
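The study performed the split and scaling in MATLAB; the NumPy sketch below only illustrates the same 0–1 normalization and the 45 / 80–10–10 partition on placeholder arrays.

```python
import numpy as np

rng = np.random.default_rng(0)

def min_max_scale(x: np.ndarray) -> np.ndarray:
    """Scale each column to the 0-1 range, as done for the whole database."""
    return (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))

X = rng.random((1345, 5))          # placeholder inputs (BET, Vmeso, Vmicro, T, P)
y = rng.random((1345, 1))          # placeholder CO2 uptake
Xs, ys = min_max_scale(X), min_max_scale(y)

idx = rng.permutation(len(Xs))
hold_out, model_idx = idx[:45], idx[45:]               # 45 points reserved for final prediction
n = len(model_idx)
train = model_idx[: int(0.8 * n)]
val = model_idx[int(0.8 * n): int(0.9 * n)]
test = model_idx[int(0.9 * n):]
print(len(train), len(val), len(test), len(hold_out))  # 1040 130 130 45
```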

Theory and methodology

Artificial neural networks

An artificial neural network (ANN) is a computational model inspired by the architecture of the human brain that is used to predict intricate and non-linear systems. Within the network's structure, artificial neurons are interconnected across the input, hidden, and output layers90. The neural network is exposed to input–output pairs and undergoes training to predict the output variables. The training process establishes the connection strengths among the processing neurons through the use of an appropriate training algorithm. The connection weights and the biases between the layers thus shape the input signals, and an activation function transforms the sum of these signals; training aims to minimize the disparity between the predicted output and the actual output data. The commonly utilized activation functions include purelin (Eq. 2), logsig (Eq. 3), and tansig (Eq. 4).

$$ f(x) = x $$
(2)
$$ f(x) = \frac{1}{1 + e^{-x}} $$
(3)
$$ f(x) = \frac{2}{1 + e^{-2x}} - 1 $$
(4)
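The three activation functions of Eqs. (2)–(4) can be written directly in NumPy; this sketch is only an illustration mirroring the MATLAB purelin, logsig, and tansig transfer functions.

```python
import numpy as np

def purelin(x):
    """Eq. (2): linear transfer function."""
    return x

def logsig(x):
    """Eq. (3): logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tansig(x):
    """Eq. (4): hyperbolic-tangent sigmoid, output in (-1, 1)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

x = np.array([-2.0, 0.0, 2.0])
print(purelin(x), logsig(x), tansig(x))
```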

Prominent types of ANNs include the radial basis function (RBF) and the multilayer perceptron (MLP)91. It is crucial to highlight the key distinction between these networks, which lies in the functioning of neurons.

The RBF-ANN architecture consists of an input layer, a hidden layer, and an output layer. The neurons within the hidden layer utilize radial basis functions as their activation functions. Through the utilization of a linear optimization strategy and the adjustment of weights during the minimization of mean square error, this algorithm can ascertain the optimal solution.

As mentioned earlier, the multilayer perceptron (MLP) is an alternative form of ANN. This algorithm comprises multiple layers, with the input layer being the first and the output layer being the last. Intermediate and hidden layers connect the input and output layers, where various forms of activation functions can be applied92. Additional information about MLP and RBF algorithms can be found in the literature93,94.

MLP training algorithms

The algorithm's learning process involves the forward propagation of data and the backward propagation of errors. Input data enters the model through the input layer without initial processing, is processed in the hidden layers, and then reaches the output layer. If the difference between the network's predictions and the actual outputs does not meet the required level of accuracy, the error is backpropagated through the network for further adjustment.

The backpropagation of errors works by propagating the difference between the network's output and the actual output back through the hidden layer to the input layer. The network's training procedure continues until the error between the network result and the actual output falls within the allowed tolerance or reaches a predetermined number of learning cycles28.

Furthermore, there are six distinct classes of backpropagation algorithms: adaptive momentum, self-adaptive learning rate, resilient backpropagation, conjugate gradient, Quasi-Newton, and Bayesian regularization95.

Adaptive momentum

The gradient descent (GD) algorithm is employed to determine an optimal set of internal variables for model optimization in machine learning and deep learning problems. Typically, gradient descent involves three steps: (1) initializing the internal variables, (2) evaluating the model using the internal variables and the loss function, and (3) updating the internal variables in a manner that moves toward optimal points. The gradient descent technique involves an iterative process as shown in Eq. (5).

$$ w^{i + 1} = w^{i} - \nabla f(w^{i} )\eta^{i} $$
(5)

In the given equation, wi represents the set of variables that need to be updated, ∇f(wi) represents the gradient of the loss function f with respect to the variables wi, and η denotes the learning rate. The value of η can be a constant or determined through a one-dimensional optimization along the training direction at each step. The primary objective of gradient descent is to locate the points that minimize the loss function, making this process essential in the optimization of the loss function96.

Gradient descent with momentum backpropagation (GDM) is a training algorithm that utilizes batch steepest descent with an enhanced convergence rate. It incorporates momentum to adapt to trends in local gradients and error surfaces, thereby mitigating the risk of getting stuck in shallow local minima. By employing momentum, GDM achieves faster convergence during the training process97.
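A minimal sketch of the two update rules follows, applied to a one-dimensional quadratic loss; the learning rate, momentum coefficient, and loss function are illustrative choices, not values used in this study.

```python
import numpy as np

def grad(w):
    """Gradient of an illustrative quadratic loss with its minimum at w = 3."""
    return 2.0 * (w - 3.0)

w_gd, w_gdm, velocity = 0.0, 0.0, 0.0
eta, momentum = 0.1, 0.9
for _ in range(200):
    w_gd -= eta * grad(w_gd)                         # plain gradient descent, Eq. (5)
    velocity = momentum * velocity - eta * grad(w_gdm)
    w_gdm += velocity                                # gradient descent with momentum (GDM)
print(round(w_gd, 4), round(w_gdm, 4))               # both converge to the minimum at w = 3
```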

Self-adaptive learning rate

The efficacy of the algorithm relies on the appropriate configuration of the learning rate. If the learning rate is set too high, it can result in instability, while setting it too low can lead to slow convergence. Determining the optimal learning rate prior to training is impractical, as it varies during the training process depending on the algorithm's progress across the performance surface.

To enhance the performance of the gradient descent algorithm, an adaptive learning rate can be utilized, allowing for adjustments during training. The primary objective of an adaptive learning rate is to maintain a maximal learning step size while ensuring stability in the learning process98.

The conventional steepest descent (GD) backpropagation algorithm employs a fixed learning rate parameter during the network's training process. Nevertheless, the algorithm's performance significantly relies on the specific value assigned to this learning rate parameter. To address this, the gradient descent with adaptive learning rate backpropagation (GDA) algorithm was created, enabling an adaptive adjustment of the learning rate parameter. This adaptive strategy strives to maximize the size of each learning step while maintaining the stability of the learning process. In the GDA algorithm, the optimal value for the learning rate parameter varies depending on the trajectory of the gradient across the error surface97. The training algorithm referred to as gradient descent with momentum and adaptive learning rate backpropagation (GDX) integrates both the adaptive learning rate and momentum training principles. It is similar to the GDA algorithm, but with the addition of a momentum coefficient as a training parameter. Consequently, the weight vector is updated using the same approach as in GDM, while incorporating a variable learning rate as in GDA.

Resilient backpropagation (RP)

Resilient backpropagation (RP) is typically applied to eliminate the negative consequences of small partial derivative values. This algorithm has the advantage of being significantly quicker than the standard steepest descent algorithm95. In the hidden layers of multilayer networks, sigmoid activation functions are typically utilized to restrict the output range. As the magnitude of the input increases, the slope of a sigmoid function approaches zero. This poses a challenge when training a multilayer network using gradient descent and sigmoid functions, as the gradients can become exceedingly small, resulting in weight and bias updates that deviate significantly from the optimal values. The resilient backpropagation training algorithm mitigates this adverse impact of small partial derivatives: only the sign of the derivative is used to determine the direction of the weight update, while the derivative's actual value does not affect the update. The magnitude of the weight change is instead determined by a separate update value98.
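A simplified sketch of the sign-based update for a single weight is given below; the increase/decrease factors of 1.2 and 0.5 and the step limits are common defaults assumed here, not values reported in the paper.

```python
import numpy as np

def rprop_step(grad_now, grad_prev, step, w,
               eta_plus=1.2, eta_minus=0.5, step_max=50.0, step_min=1e-6):
    """One resilient-backpropagation update for a single weight: only the sign
    of the gradient sets the direction, while the step size adapts separately."""
    if grad_now * grad_prev > 0:        # same sign: keep going, enlarge the step
        step = min(step * eta_plus, step_max)
    elif grad_now * grad_prev < 0:      # sign change: overshoot, shrink the step
        step = max(step * eta_minus, step_min)
    w = w - np.sign(grad_now) * step
    return w, step

w, step, g_prev = 0.0, 0.1, 0.0
for _ in range(30):
    g_now = 2.0 * (w - 3.0)             # gradient of an illustrative quadratic loss
    w, step = rprop_step(g_now, g_prev, step, w)
    g_prev = g_now
print(round(w, 2))                       # approaches the minimum at w = 3
```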

Conjugate gradient

The conjugate gradient algorithm, which combines elements of gradient descent and Newton's method, enhances the convergence rate of artificial neural networks by eliminating the requirement to measure, store, and invert the Hessian matrix. It explores conjugate directions in a coordinated manner, leading to faster convergence compared to the directions followed by gradient descent. The algorithm establishes the sequence of training directions using the equation provided below.

$$ y^{i + 1} = - v^{i + 1} + y^{i} c^{i} $$
(6)

with the initial training direction vector given by

$$ y^{0} = - v^{0} $$
(7)

where y represents the training direction vector, c denotes the conjugate parameter, and v is the gradient; the initial training direction is set as the negation of the gradient96. The conjugate gradient algorithm's parameter update procedure is defined by

$$ w^{i + 1} = w^{i} + y^{i} \eta^{i} $$
(8)

where \( \eta^{i} \) is the learning rate, normally determined by line minimization.

The standard backpropagation algorithm modifies weights in the direction of the steepest descent, but this does not guarantee the quickest convergence. Conjugate gradient algorithms expedite convergence by exploring conjugate directions. The initial iteration performs steepest descent, and subsequent iterations conduct line searches and combine the new direction with the previous search direction. The determination of the new search direction depends on a constant, which in the Fletcher–Reeves update [conjugate gradient backpropagation with Fletcher–Reeves update (CGF) algorithm] is calculated as the ratio of the squared norm of the current gradient to the squared norm of the previous gradient99. The constant employed in the Polak–Ribiére update, as part of the conjugate gradient backpropagation with the Polak–Ribiére update (CGP) algorithm, is calculated as the inner product of the previous gradient change with the current gradient, divided by the squared norm of the previous gradient. In contrast to the Fletcher–Reeves method, which involves three vectors, the Polak–Ribiére update requires marginally more storage97.
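The two conjugate parameters and the direction update can be written compactly, as in the sketch below; the gradient vectors are placeholders, and the sign convention follows the standard conjugate gradient formulation.

```python
import numpy as np

def beta_fletcher_reeves(g_new, g_old):
    """CGF: ratio of the squared norms of the current and previous gradients."""
    return float(g_new @ g_new) / float(g_old @ g_old)

def beta_polak_ribiere(g_new, g_old):
    """CGP: inner product of the gradient change with the current gradient,
    divided by the squared norm of the previous gradient."""
    return float(g_new @ (g_new - g_old)) / float(g_old @ g_old)

def next_direction(g_new, d_old, beta):
    """New conjugate search direction (Eq. 6, standard sign convention)."""
    return -g_new + beta * d_old

g_old = np.array([1.0, -2.0])
g_new = np.array([0.5, 0.4])
d_old = -g_old                                   # initial direction: negative gradient (Eq. 7)
print(next_direction(g_new, d_old, beta_fletcher_reeves(g_new, g_old)))
print(next_direction(g_new, d_old, beta_polak_ribiere(g_new, g_old)))
```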

In every conjugate gradient algorithm, the search direction is periodically reset to the negative of the gradient. While other reset techniques can enhance training effectiveness, the typical reset point occurs when the number of iterations equals the number of network parameters (weights and biases). The Powell–Beale restart [conjugate gradient backpropagation with Powell–Beale restarts (CGB) algorithm] is an example of such a reset technique. Powell introduced this restart strategy for enhancing training effectiveness, building upon an earlier suggestion from Beale. The strategy triggers a restart when there is little orthogonality remaining between the current gradient and the previous gradient. Compared to Polak–Ribiére, the Powell–Beale algorithm demands slightly greater storage capacity97.

The three previously discussed conjugate gradient techniques require a line search after each iteration, which can be computationally expensive because the network output must be computed for all training inputs multiple times. To address this issue and significantly reduce the number of calculations required per iteration, the scaled conjugate gradient backpropagation (SCG) training technique was developed. However, SCG may require more iterations than the other conjugate gradient algorithms to achieve convergence. The storage requirements of the SCG algorithm are comparable to those of the CGF algorithm. In the majority of problems, SCG yields superlinear convergence and is at least an order of magnitude faster than the standard backpropagation algorithm. By using a step-size scaling mechanism, SCG avoids a time-consuming line search at each learning iteration, making it speedier than other recently suggested second-order algorithms97.

Quasi-Newton

Quasi-Newton methods, a subset of variable metric techniques, are employed to identify local extremum points of functions. These methods draw their inspiration from Newton's method, designed to pinpoint stationary points of a function where the gradient equals zero. Newton's method assumes that the function can be locally approximated as a quadratic function in the vicinity of the optimal point. To accomplish this, it relies on the utilization of both the first and second derivatives of the function. In cases involving higher dimensions, Newton's method extends its application by incorporating the gradient and the Hessian matrix, which encapsulates the second derivatives of the function, with the objective of function minimization100.

Newton's method presents an alternative to conjugate gradient methods, known for its rapid convergence and optimization capabilities. It relies on the Hessian matrix, which leads to faster convergence than conjugate gradient methods; however, calculating the Hessian matrix for feedforward neural networks is challenging and computationally expensive. Quasi-Newton methods such as the BFGS quasi-Newton backpropagation (BFGS) algorithm therefore build up an approximation of the Hessian from gradient information at each iteration. BFGS is well-suited for smaller networks, although it requires more storage and computational resources per iteration than conjugate gradient methods due to its complexity and cost95.

The one-step secant backpropagation (OSS) training technique strikes a balance between conjugate gradient algorithms and full quasi-Newton algorithms. It reduces storage and computation per iteration by not storing the complete Hessian matrix; instead, it assumes at each iteration that the previous Hessian was the identity matrix, which allows the new search direction to be determined without computing a matrix inverse. It nevertheless requires somewhat more computation and storage per iteration than conjugate gradient methods97,99.

Levenberg–Marquardt backpropagation (LM)

In many problems, the Levenberg–Marquardt (LM) algorithm outperforms standard gradient descent and many other conjugate gradient methods. LM is a combination of the local search features of Gauss–Newton and the error reduction consistency afforded by the gradient descent algorithm. The feedforward network training based on LM is considered an unconstrained optimization issue. The Levenberg–Marquardt algorithm was developed to approximate the training speed of second-order methods without explicitly computing the Hessian matrix. In the case where the performance function of feedforward networks can be expressed as a sum of squares, the Hessian matrix can be estimated using the following approximation101:

$$ H = J^{{\text{T}}} J $$
(9)

The calculation of the gradient can be expressed in the following manner:

$$ g = J^{T} e $$
(10)

In this context, the Jacobian matrix denoted as J comprises the first derivatives of network errors concerning weights and biases, while e represents the vector of network errors. The Jacobian matrix can be obtained using a standard back-propagation technique, which is notably less intricate than the computation of the Hessian matrix. The Levenberg–Marquardt algorithm utilizes this approximation of the Hessian matrix in the subsequent Newton-like update iteration98.

$$ x_{k + 1} = x_{k} - [J^{T} J + \mu I]^{ - 1} J^{T} e $$
(11)

When the scalar μ is zero, the update reduces to Newton's method with the approximate Hessian; when μ is large, it becomes gradient descent with a small step size. Because Newton's method converges faster and more accurately near an error minimum, the goal is to shift toward Newton's method as quickly as possible. Accordingly, μ is decreased after each successful step (an improvement in the performance function) and increased only when a tentative step would increase the performance function. In this way, the performance function decreases at every iteration of the algorithm98.
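A minimal sketch of the update in Eqs. (9)–(11) for a toy linear least-squares fit is shown below; the ×10 / ÷10 damping schedule for μ is a common choice assumed here and is not taken from the paper.

```python
import numpy as np

def lm_step(x, jacobian, residuals, mu):
    """One Levenberg-Marquardt update: x_new = x - (J^T J + mu*I)^-1 J^T e."""
    J, e = jacobian(x), residuals(x)
    H_approx = J.T @ J                                               # Eq. (9)
    g = J.T @ e                                                      # Eq. (10)
    return x - np.linalg.solve(H_approx + mu * np.eye(len(x)), g)   # Eq. (11)

# Toy problem: fit y = a*t + b, so the "weights" are x = [a, b].
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * t + 1.0
residuals = lambda x: x[0] * t + x[1] - y
jacobian = lambda x: np.stack([t, np.ones_like(t)], axis=1)

x, mu = np.array([0.0, 0.0]), 1e-3
for _ in range(10):
    x_trial = lm_step(x, jacobian, residuals, mu)
    if np.sum(residuals(x_trial) ** 2) < np.sum(residuals(x) ** 2):
        x, mu = x_trial, mu / 10.0      # successful step: accept it and reduce mu
    else:
        mu *= 10.0                      # failed step: increase mu (more gradient-descent-like)
print(x)                                # close to [2.0, 1.0]
```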

Bayesian regularization backpropagation (BR)

The conventional Backpropagation (BP) algorithm can encounter the problem of overfitting, which manifests as reduced bias and increased variance. Conversely, the Bayesian regularization of artificial neural networks (BRANN) exhibits superior generalization capabilities. The BRANN minimizes the objective function F, which combines the mean squared error function ED and the weight attenuation function EW. It probabilistically determines the optimal weights and parameters of the objective function. The objective function of the BRANN can be represented as102:

$$ F = \beta E_{D} + \alpha E_{W} $$
(12)
$$ E_{D} = \frac{1}{N}\sum\limits_{i}^{N} {(y_{i} - t_{i} )^{2} } = \frac{1}{N}\sum\limits_{i}^{N} {e_{i}^{2} } $$
(13)
$$ E_{W} = \frac{1}{2}\sum\limits_{i}^{m} {w_{i}^{2} } $$
(14)

In the given equations, α and β are hyper-parameters that control the distribution of the other parameters, w represents the weights, and m denotes the number of weights. D refers to the training set data, represented as (xi, ti) with i ranging from 1 to N, the total number of input–output pairs in the training set, and yi represents the network output for the i-th input–output pair102. The ANN model should produce nearly identical error rates for training and test data. Regularization forces the neural network to converge to a set of weights and biases with smaller magnitudes, which makes the network's response smoother and reduces the likelihood of overfitting.
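The regularized objective of Eqs. (12)–(14) can be computed directly, as in the sketch below; the α and β values are placeholders, since in BRANN they are re-estimated automatically during training rather than fixed by hand.

```python
import numpy as np

def bayesian_objective(errors, weights, alpha, beta):
    """F = beta*E_D + alpha*E_W (Eq. 12), with E_D the mean squared error
    (Eq. 13) and E_W half the sum of squared weights (Eq. 14)."""
    e_d = np.mean(errors ** 2)
    e_w = 0.5 * np.sum(weights ** 2)
    return beta * e_d + alpha * e_w

errors = np.array([0.10, -0.05, 0.20])          # y_i - t_i, illustrative values
weights = np.array([0.30, -1.20, 0.70, 0.05])   # network weights, illustrative values
print(bayesian_objective(errors, weights, alpha=0.01, beta=1.0))
```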

Development of optimal ANN structure

The process of constructing the MLP and RBF neural networks is demonstrated step by step in Figs. 2 and 3, respectively. This process typically involves determining the network's architecture, training the network, and evaluating its performance. A trial-and-error methodology was employed to identify the optimal structure of the artificial neural network (ANN)65. The optimal ANN structure was subsequently determined based on the lowest mean square error (MSE) (Eq. 15), the highest Pearson's linear correlation coefficient (R) (Eq. 16), and the lowest average absolute relative deviation percentage (AARD%) (Eq. 17)103.

$$ MSE = \frac{1}{N}\sum\limits_{i = 1}^{N} {(\alpha_{\exp } - \alpha_{cal} )}^{2} $$
(15)
$$ R = \frac{\sum\limits_{i = 1}^{N} \left( \alpha_{\exp} - \overline{\alpha}_{\exp} \right)\left( \alpha_{cal} - \overline{\alpha}_{cal} \right)}{\sqrt{\sum\limits_{i = 1}^{N} \left( \alpha_{\exp} - \overline{\alpha}_{\exp} \right)^{2} \sum\limits_{i = 1}^{N} \left( \alpha_{cal} - \overline{\alpha}_{cal} \right)^{2}}} $$
(16)
$$ AARD\% = \frac{100}{N}\sum\nolimits_{i = 1}^{N} {\left| {\frac{{\alpha_{\exp } - \alpha_{cal} }}{{\alpha_{cal} }}} \right|} $$
(17)
Figure 2

Schematic view of the MLP creation steps.

Figure 3

Schematic view of the RBF creation steps.

Here αexp represents the data points obtained from experimental measurements, while αcal represents the corresponding data points calculated by the models. N denotes the total number of data points.
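The three criteria of Eqs. (15)–(17) can be implemented in a few lines; the sample arrays below are illustrative stand-ins for αexp and αcal.

```python
import numpy as np

def mse(a_exp, a_cal):                       # Eq. (15)
    return np.mean((a_exp - a_cal) ** 2)

def pearson_r(a_exp, a_cal):                 # Eq. (16)
    de, dc = a_exp - a_exp.mean(), a_cal - a_cal.mean()
    return np.sum(de * dc) / np.sqrt(np.sum(de ** 2) * np.sum(dc ** 2))

def aard_percent(a_exp, a_cal):              # Eq. (17)
    return 100.0 / len(a_exp) * np.sum(np.abs((a_exp - a_cal) / a_cal))

a_exp = np.array([1.2, 2.5, 3.1, 4.0])       # measured CO2 uptake (illustrative)
a_cal = np.array([1.1, 2.6, 3.0, 4.2])       # model output (illustrative)
print(mse(a_exp, a_cal), pearson_r(a_exp, a_cal), aard_percent(a_exp, a_cal))
```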

In prior studies103,104, researchers first fixed a network topology and explored several training algorithms on that specific structure to identify the most effective learning algorithm; they then determined the optimal network configuration by varying the number of neurons and layers. This approach discards an essential part of the search space, namely networks that perform better with the chosen training algorithm but have a structure different from the predetermined one. It therefore underlines the importance of considering a broader range of network architectures, since superior configurations may be missed in such an initial exploration.

In this study, the proposed method was designed to highlight the impact of the learning algorithm on the optimal network configuration. To achieve this, an optimal model was first developed for each training algorithm; the most suitable network architecture and the best selection of activation functions were then determined for each model. This sequential approach enabled a systematic exploration of the influence of the training algorithm on the identification of ideal network configurations within the present domain of study. Thirteen different backpropagation training algorithms were applied to train the MLP neural networks. To determine the optimal neural network architecture for each training algorithm, two hidden layers with neuron counts ranging from zero to 50 were considered. Comparing four distinct combinations of logsig and tansig functions in the hidden layers of each algorithm's optimal architecture led to the detection of suitable activation functions. The initial weights and biases were assigned randomly by the MATLAB software. To mitigate the impact of the initial weight and bias assumptions on the outcomes, each MLP topology was executed at least three times, with only the best result retained. This approach was employed to propose a model with enhanced and more reliable performance, accounting for variations in the training process.
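The search itself was carried out with the MATLAB neural network toolbox; the Python sketch below only outlines the logic of the loop (training algorithms × two hidden layers of 0–50 neurons × four activation pairs × three restarts) with a dummy stand-in for the actual training call.

```python
import random
from itertools import product

def train_mlp(algorithm, layers, activations, seed):
    """Placeholder for a real MATLAB training run; it returns a pseudo-random
    MSE so that the search loop below is executable end to end."""
    random.seed(hash((algorithm, layers, activations, seed)) & 0xFFFF)
    return random.uniform(1e-5, 1e-3)

algorithms = ["trainlm", "trainbr", "trainscg", "trainrp"]        # subset of the 13 for brevity
activation_pairs = [("tansig", "tansig"), ("tansig", "logsig"),
                    ("logsig", "tansig"), ("logsig", "logsig")]
best = {}
for algo in algorithms:
    # coarse neuron grid for brevity; the study swept 0-50 neurons per layer
    for n1, n2, act in product(range(1, 51, 7), range(0, 51, 7), activation_pairs):
        mse_best_of_3 = min(train_mlp(algo, (n1, n2), act, seed=s) for s in range(3))
        if algo not in best or mse_best_of_3 < best[algo][0]:
            best[algo] = (mse_best_of_3, (n1, n2), act)
print(best["trainlm"])
```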

Similar to the MLP neural network, there are no specific rules for determining the optimal architecture of the RBF network. In the case of the RBF network, the number of hidden layers remains constant at one, so the sole structural parameter that requires determination is the number of neurons in the hidden layer, which is established through a process of trial and error. In this research, a Gaussian function is employed as the radial basis activation function, and the spread value determines the width of the Gaussian.
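As a rough illustration of how the spread parameter shapes a Gaussian RBF network, the sketch below fits the linear output weights of a small RBF regressor for a few spread values; it uses fixed, randomly chosen centres rather than MATLAB's incremental newrb procedure, and all data are placeholders.

```python
import numpy as np

def rbf_design(X, centers, spread):
    """Gaussian radial-basis design matrix; each hidden neuron responds to the
    distance between an input and its centre, scaled by the spread."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * spread ** 2))

def fit_rbf(X, y, centers, spread):
    """Least-squares fit of the linear output weights (the RBF training step)."""
    Phi = rbf_design(X, centers, spread)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

rng = np.random.default_rng(1)
X = rng.random((200, 5))                              # placeholder scaled inputs
y = X @ np.array([0.2, 0.1, 0.4, 0.1, 0.2])           # placeholder target
centers = X[rng.choice(len(X), 30, replace=False)]    # 30 hidden neurons taken from the data
for spread in (3, 9, 10):                             # a few of the spread values examined in Table 5
    w = fit_rbf(X, y, centers, spread)
    train_mse = np.mean((rbf_design(X, centers, spread) @ w - y) ** 2)
    print(spread, train_mse)
```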

Results and discussion

Best MLP model

Several MLP structures were investigated for each training algorithm to determine the optimal ANN for predicting CO2 adsorption capacity. Figure 4 displays the best mean square error (MSE) value achieved for each topology. The analysis of Fig. 4 suggests that a more complex architecture, with two hidden layers and specific neuron counts in those layers, is necessary to achieve the best accuracy for this task, since these configurations yield lower MSE values than single-layer networks. To enhance model accuracy, for each training algorithm the optimal ANN structure (the one with the lowest MSE) was used to evaluate four different combinations of activation functions; the outcomes of these evaluations are presented in Table 3. With the tansig activation function in the first layer and logsig in the second layer, the lowest MSE values for the BFG, RP, and CGB training algorithms are 9.01E−05, 4.78E−05, and 6.81E−05, respectively. The minimum MSE of 9.23E−05 for the CGF training algorithm was attained when the activation functions of both layers were logsig, whereas for the other training algorithms the MSE reached its minimum when both hidden layers used tansig. After obtaining the characteristics of the optimal network for each training algorithm, these networks were applied to the prepared data and compared with each other. The outcomes of implementing the networks with the various algorithms and optimal architectures are listed in Table 4, which displays the MSE and correlation coefficient (R) for the training, validation, test, and total data of each network.

Figure 4

MSE for different neuron configurations.

Table 3 The outcomes of operating neural networks with various activation function combinations.
Table 4 The results of implementing networks with diverse algorithms and optimal architectures.

The LM training algorithm exhibits the highest level of accuracy among the considered training algorithms, showcasing an impressive MSE (mean squared error) of 2.62932E−05 across all datasets. This high accuracy extends to the training dataset with an MSE of 1.9098E−05, the validation dataset with 3.6342E−05, and even the test dataset with 7.3798E−05. In stark contrast, the GDM (gradient descent with momentum) algorithm performs less accurately, registering the lowest accuracy levels among the algorithms under examination. The superior performance of the LM algorithm in terms of accuracy can be attributed to its adaptability, use of second-order information, and efficient optimization in complex landscapes. Conversely, the GDM algorithm's lower accuracy may result from its reliance on fixed learning rates, sensitivity to initialization, and a greater tendency to get stuck in local minima.

The SCG algorithm, with a remarkably short runtime of 0.7640 s, stands out as the most time-efficient method for training the neural networks in this study. In contrast, the BFG algorithm exhibits the longest training time, consuming 56.6110 s to complete the network training process. This substantial difference in runtime highlights the disparity in computational efficiency between these two optimization algorithms; it likely arises from a combination of algorithmic differences, problem-specific factors, and the chosen settings or hyperparameters.

The LM training algorithm demonstrates efficient convergence within a relatively small number of epochs, specifically, 82 epochs. In contrast, both the GDM and GD training algorithms have reached the predefined maximum number of epochs, set at 5000 epochs, without achieving the desired convergence. This disparity in the number of epochs required for convergence underscores the distinct convergence behaviors of these algorithms. The LM algorithm's ability to achieve convergence within a limited number of epochs suggests its effectiveness in optimizing neural networks, while the protracted training process observed in GDM and GD may indicate challenges in navigating the optimization landscape.

Figure 5 presents a comprehensive comparison of neural networks trained with various training algorithms, considering performance accuracy, run time, and the number of epochs. This comparison provides valuable insights into the trade-offs between these critical aspects of algorithm performance.

Figure 5

Comparing the efficacy of neural networks trained with various training algorithms with respect to: (a) accuracy, (b) run time, (c) number of epochs.

As previously stated, the ANN trained with the LM backpropagation algorithm was chosen as the most effective training method due to its low mean square error (MSE < 2.6293E−05) and high correlation coefficient (R > 0.9951). Therefore, it is chosen as the optimal training algorithm to build the MLP neural network model for simulating and predicting carbon dioxide adsorption. Figure 6 depicts the structure of the optimal MLP network obtained.

Figure 6

The structure of the optimal MLP network (trained with the LM algorithm).

Figure 7 depicts the variation in mean square error as a function of the number of training epochs, where the optimal MLP model displays its best validation performance (3.6342E−05) at 72 epochs. In addition, the error histogram in Fig. 8 illustrates the operation of the neural network. Comparing the collected experimental data with the values modeled by the MLP neural network in Fig. 9 shows that the experimental and predicted data are highly congruent. Experimental values are always associated with some error, so data with less error are needed for a more accurate network. The R correlation coefficients for training, validation, test, and total data were 0.99614, 0.99441, 0.99142, and 0.99512, respectively, emphasizing the reliability and accuracy of the chosen neural network model. This consistency of the model's performance across different data subsets reinforces the robustness of the findings and indicates that neural networks are appropriate for modeling CO2 adsorption on carbon-based adsorbents.

Figure 7

MSE by number of epochs for the data sets in the MLP network.

Figure 8

Error histogram plot for MLP network data sets.

Figure 9

Comparing the experimental data with the results of the MLP neural network model with the (a) training, (b) validation, (c) test, (d) total data.

Best RBF model

As previously emphasized, it is essential to ascertain the optimal value for the spread parameter within the radial basis function for the RBF neural network. As evidenced by the data presented in Table 5, it becomes apparent that a spread value of 10 corresponds to the lowest observed mean squared error (MSE). The hidden layer encompasses a notable 302 neurons, signifying a relatively large quantity compared to other network configurations. However, it is noteworthy that this larger neuron count does not yield a substantially different mean squared error (MSE) value. Conversely, in the network with a spread parameter set at 9, a slight increment in MSE is observed. Nevertheless, this configuration is accompanied by a reduced number of neurons in the hidden layer, totaling 207. This reduction not only leads to diminished computational time but also translates into lower computational costs. Furthermore, it is worth noting that this particular network does not feature an excessively large or excessively small number of neurons within its hidden layer in comparison to alternative configurations. Additionally, its performance accuracy is notably high, making it the preferred choice as the optimal model for the RBF network. This selection is visually represented in Fig. 10, where the chosen RBF network configuration is depicted. The change in mean square error is displayed in Fig. 11, in which the optimal RBF model with 207 neurons in the hidden layer exhibits the best performance (9.8402E−05). In addition, a moderately good agreement between the RBF output values and the experimental data is observable in the regression diagram of Fig. 12, with the value of R equal to 0.98145.

Table 5 The MSE values for the spread range of 3 to 12 in the RBF network.
Figure 10

The structure of the optimal RBF network.

Figure 11

Variations in the MSE value of the RBF neural network based on the number of epochs.

Figure 12

Linear regression between experimental data and RBF outputs.

Prediction of CO2 adsorption with new data

To evaluate the efficacy of the created neural network models, the obtained MLP and RBF models were applied to 45 new data points (which were initially separated from the data set), and the predicted CO2 uptake was compared to the experimental values. The results of CO2 adsorption prediction by MLP neural network models with various training algorithms are displayed in Table 6. The LM algorithm demonstrates the highest accuracy among all models, evidenced by the lowest AARD% value of 2.80 and the highest correlation coefficient of 0.9993. The BR algorithm also yields commendable results, with an AARD% of 4.27 and a correlation coefficient of 0.9988. The outcomes of RBF neural network models with varying spread values in predicting the quantity of CO2 adsorption are presented in Table 7. The model with a spread of 9, achieving the lowest AARD% value of 13.41%, attains the highest accuracy among the RBF models. For visual comparison, Fig. 13 illustrates the linear regression between the experimental CO2 adsorption values and the neural network outputs for both the MLP and RBF models using the new data.

Table 6 Prediction of CO2 uptake by MLP neural network models with distinct training algorithms.
Table 7 Prediction of CO2 uptake by RBF neural network models with distinct spread values.
Figure 13

Linear regression between new experimental data and (a) MLP outputs, (b) RBF outputs.

Comparing MLP and RBF

Modeling and simulation of carbon dioxide adsorption on carbon-based adsorbents with neural networks revealed that the MLP network (with the LM training algorithm) agrees with the experimental values more closely than the RBF network. Table 8 provides the MSE values and correlation coefficients derived from the simulation and from prediction with new data for both networks. The MLP deep neural network is more appropriate for modeling and simulating this process than the RBF network owing to its higher correlation coefficient and lower mean square error. As previously stated, the relation of Durá et al.45 predicts the quantity of carbon dioxide adsorbed from the micropore and mesopore volumes; that model covers 12 distinct adsorbents with a squared correlation coefficient of 0.9829. With a correlation coefficient of 0.9951 for more than 200 adsorbents at varying temperatures and pressures, the MLP deep network model obtained in this study is clearly more accurate and more broadly applicable.

Table 8 Comparing the performance of various models for the CO2 adsorption process.

Evaluation of adsorption factors

Figure 14 exhibits a three-dimensional graphical representation of the relationship between carbon dioxide adsorption, temperature, and pressure. This depiction assumes that the mesopore volume, micropore volume, and BET surface area remain constant at 0.75 cm3/g, 0.53 cm3/g, and 1510 m2/g, respectively. The graph illustrates a notable upward trend in adsorption with increasing pressure, aligning with findings in pertinent studies24,105. Conversely, as the temperature increases to 120 °C, a slight reduction in adsorption becomes apparent. This decrease can be attributed to the exothermic nature of the adsorption process, whereby the concentration of gas adsorbed on the adsorbent's surface diminishes as the temperature rises24,40. According to Fig. 14, the highest levels of CO2 adsorption are observed within the pressure range of 30–50 bar and the temperature range of 0–20 °C.

Figure 14

CO2 adsorption based on pressure and temperature for MLP trained with the LM training algorithm.

The carbon dioxide adsorption characteristics at 25 °C, a BET surface area of 500 square meters per gram, and pressures of 1, 5, 15, and 20 bar are presented in Fig. 15, with a focus on the role of mesopores and micropores. At 1 bar pressure, the influence of micropore volume in the range of 0.6–1.2 cm3/g on carbon dioxide adsorption is predominantly observed in Fig. 15a. This trend is sustained up to 5 bar, where a significant role for micropores is depicted in Fig. 15b. However, as pressures increase to 15 and 20 bars, the prominence of mesopore volume becomes more evident, as observed in Fig. 15c,d. Particularly at 20 bar, substantial growth in the quantity of adsorption within the mesopore volume range of 4–8 cm3/g, is exhibited. These findings are aligned with prior research37,40,42,43, which suggests that carbon dioxide adsorption is primarily governed by micropore volume at lower pressures and mesopore volume at higher pressures. This shift may be attributed to the saturation of micropores at higher pressures, necessitating the contribution of mesopores to achieve higher CO2 uptake43.

Figure 15

CO2 adsorption contours based on mesopore and micropore volume for MLP deep network with LM training algorithm at pressures (a) 1 bar, (b) 5 bar, (c) 15 bar, (d) 20 bar.

Figure 16 presents the depiction of carbon dioxide adsorption onto the adsorbent, examining its dependency on BET surface area, temperature, and pressure. Figure 16a elucidates the outcomes under a fixed pressure of 5 bar, with micropore and mesopore volumes of 0.53 and 0.75 cm3/g, respectively. Within this parameter range, the maximum CO2 adsorption occurred at lower temperatures, ranging from 0 to 60 °C, combined with a higher BET surface area, ranging from 2000 to over 3500 m2/g. Generally, a diminishing trend in carbon dioxide adsorption was noted as the temperature increased. Conversely, as the BET surface area approached 2000 m2/g, a notable increase in adsorption was recorded, followed by a modest decline, although the adsorption remained elevated.

Figure 16

3D plots of CO2 adsorption based on (a) temperature and BET surface, (b) pressure and BET surface for MLP trained with the LM training algorithm.

In Fig. 16b, the findings are presented at a constant temperature of 25 °C, with micropore and mesopore volumes set at 0.75 and 4.5 cm3/g, respectively. Within this context, the peak adsorption occurs within a pressure range of 30–50 bar, in conjunction with a BET surface area of 1000 to 2500 m2/g. Overall, carbon dioxide adsorption exhibited a positive correlation with both increasing pressure and BET surface area. Nevertheless, the observed increase in surface area did not consistently result in a simultaneous increase in adsorption across all pressure levels. This observation suggests that a substantial specific surface area indeed enhances the CO2 adsorption capacity, but only within a specific range of CO2 pressures37.

Figure 17 delineates the influence of BET surface area, mesopore volume, and micropore volume on CO2 adsorption. Within Fig. 17a, a discernible trend emerges, wherein carbon dioxide adsorption exhibits an ascending pattern in response to elevated BET surface area and mesopore volume values. This behavior is observed under specific conditions, including a micropore volume of 0.45 cm3/g, a temperature of 25 °C, and a pressure of 5 bar. Notably, the zenith of carbon dioxide adsorption manifests within a designated range, observed within 1 to 7 cm3/g for mesopore volume and 2500–3000 m2/g for BET surface area.

Figure 17

3D plots of CO2 adsorption based on (a) mesopore volume and BET surface, (b) micropore volume and BET surface for MLP trained with the LM training algorithm.

In Fig. 17b, conducted at 25 °C and 15 bar, with a mesopore volume of cm3/g, the paramount point of carbon dioxide adsorption is situated within the domain defined by micropore volume values ranging from 0.4 to 0.8 and BET surface area values spanning from 3000 to 3700 m2/g. Furthermore, a substantial quantity of adsorption is discernible within the span characterized by micropore volume from 1 to 1.4 and BET surface area from 1500 to 3500 m2/g. These findings signify the distinct influence exerted by micropore volume at lower BET surface area values and the accentuated impact of BET surface area at reduced micropore volumes.

It is imperative to underscore the formidable challenge posed by the synthesis of porous carbon materials that simultaneously possess a high BET surface area and a low micropore volume, since an extensive BET surface area normally arises from a substantial micropore volume, as noted in previous research44.

Conclusion

This study successfully modeled carbon dioxide adsorption on carbon-based adsorbents using multilayer perceptron (MLP) and radial basis function (RBF) neural networks. Input variables such as BET surface, mesopore volume, micropore volume, temperature, and pressure were used in the models. After evaluating various training algorithms and activation functions, the Levenberg–Marquardt backpropagation algorithm with 'tansig' activation in hidden layers and linear output was identified as the optimal configuration for MLP models. The best MLP and RBF models achieved mean square error (MSE) values of 2.6293E−5 and 9.8401E−5, respectively. The MLP deep neural network with LM and BR training algorithms outperformed the RBF network, achieving a remarkable correlation coefficient of 0.9951 across a dataset of over 200 adsorbers. This study also revealed the significant influence of micropore volume at lower pressures and mesopore volume at higher pressures on CO2 uptake. The study has significantly contributed to the development of a comprehensive and efficient model for predicting carbon dioxide adsorption, leveraging prior research to establish a robust connection between the textural properties of adsorbents and operational conditions. This advancement enhances the ability to predict porous carbon CO2 uptake effectively.