Abstract
The success of the anaerobic digestion (AD) process for biogas production depends on a complex mix of operating factors, process conditions, and feedstock types, and it can be undermined by an inadequate understanding of the microbial, kinetic, and physicochemical processes involved. To address these limitations, efforts have been directed toward developing mathematical and intelligent models. Although mathematical models provide near-optimal solutions, they are time consuming, expensive, and demanding. Standalone intelligent models are also limited by their low predictive capability and their inability to guarantee a globally optimal solution for the prediction of cumulative biogas yield from food, fruit, and vegetable (FFV) waste. Hyperparameter optimization of such models is therefore essential to improve the prediction performance. Accordingly, this study applies a genetic algorithm (GA) to optimize an adaptive neuro-fuzzy inference system (ANFIS) for the prediction of cumulative biogas production. Seven (7) input variables, organic loading rate (OLR), volatile solids (VS), pH, hydraulic retention time (HRT), temperature, retention time, and reaction volume, were considered with cumulative biogas production as the output. The effect of the clustering technique was evaluated using three (3) techniques: fuzzy c-means, subtractive clustering, and grid partitioning. The hybrid model was evaluated using established statistical performance metrics. The subtractive clustering technique gave the best-performing model, with a root mean squared error (RMSE), mean absolute deviation (MAD), mean absolute percentage error (MAPE), and standard deviation error (error STD) of 0.0529, 0.0326, 7.6742, and 0.0474, respectively, at the model testing phase.
The results confirm the capacity of the hybrid evolutionary (genetic) algorithm based on the subtractive clustering technique to predict the biogas yield from FFV waste and to serve as an effective tool for the upscaling of anaerobic digestion units as well as in techno-economic studies toward more efficient energy utilization.
1 Introduction
The beneficiation of waste for power generation has gained significant momentum across the globe. Every year, billions of tons of waste are generated around the world. By 2050, the global community is projected to generate 3.40 billion tons of garbage annually, a drastic increase from the current value of 2.01 billion tons [1]. According to a United Nations (UN) report, 17% of global annual food production, estimated at 1.0 billion tons, was wasted in 2019 [2]. The breakdown shows that 61%, 26%, and 13% of food waste came from households, food services, and retail, respectively [2]. To add further context, the waste disposed of yearly in the USA alone is valued at $408 billion [3]. In terms of greenhouse gas emissions, the carbon footprint associated with global food wastage has been estimated at 3.3 billion tons of CO2 [4]. With the growth in population and food processing industries, the amount of waste is likely to increase. It has also been suggested that waste will increase with industrialization and urbanization [5].
Millions of tons of food, fruit, and vegetable (FFV) waste are channelled to landfills, accounting for nearly 50% of the fruits and vegetables produced globally each year [6]. FFV waste represents not only product lost along the full value chain but also the associated resources, including the land, water, labor, and energy used in producing the food. It contributes significantly to climate change since greenhouse gases are emitted during food production and distribution. Landfill or incineration approaches to FFV disposal are not advisable because FFV biodegrades quickly in the presence of contaminating microbes. Moreover, the methane produced in a landfill serves no useful purpose unless it is captured and utilized as a clean energy source; otherwise it leaks into the air and causes severe environmental pollution. FFV waste is rich in organic substances, making it a good candidate for anaerobic digestion. While the overarching target is to prevent or at least reduce waste generation, beneficiation of the remaining waste into value-added products could be a sustainable solution. When FFV waste is beneficiated, the climate benefits in two ways: landfill emissions are prevented, and the fossil fuel that would otherwise have been used for energy generation is reduced or displaced. As a low-cost waste, FFV could be beneficially deployed in anaerobic digestion (AD) for energy applications and value-added co-products, strengthening the circular economy. AD has traditionally been deployed as an effective, sustainable, and environmentally friendly technology for transforming liquid or organic waste into biogas and other value-added products such as organic fertilizer. The biogas produced in the anaerobic digestion process can be directly converted to electricity and heat.
AD is a complex process involving microbial consortia with numerous metabolic processes and kinetic reactions leading to CO2, N2, and CH4 production [7]. Other gases such as H2S and NH3 are also produced in trace amounts depending on feedstock characteristics and operating conditions. The generation of biogas from FFV proceeds in four stages: hydrolysis, acidogenesis, acetogenesis, and methanogenesis [8, 9]. At the hydrolysis stage, complex organic compounds such as lipids, proteins, and polysaccharides are converted into soluble monomers or oligomers. During acidogenesis, the sugars, fatty acids, and amino acids generated during hydrolysis are used by fermentative anaerobic bacteria to produce organic acids (such as acetic, propionic, and butyric acids), hydrogen, and CO2. The action of these bacteria makes acidogenesis the fastest process in anaerobic digestion. During acetogenesis, alcohols and volatile fatty acids are anaerobically oxidized into acetate, H2, and CO2. This process takes place in the presence of hydrogen-producing acetogenic bacteria. Finally, the methanogens (acetotrophic and hydrogenotrophic) convert acetate, H2, and CO2 into a mixture of CH4 and CO2. The acetotrophic methanogens produce around 70% of the methane, while the hydrogenotrophic pathway yields more energy than the acetate pathway since it is not rate-limited. It has been reported that methane-producing methanogens are very sensitive to environmental changes, but hydrogenotrophic methanogens are more resistant to such changes [10, 11]. Several factors affect the AD process, including pH, temperature, C:N ratio, organic loading rate (OLR), hydraulic retention time (HRT), and nutrients [12, 13]. The microbes are sensitive to pH since each group survives within a different range.
If the partial pressure of hydrogen increases, the methanogenesis phase of the anaerobic process may fail because of the accumulation of volatile fatty acids and the resulting reduction in pH [13]. The microbes active during AD are equally sensitive to temperature [14, 15]. For instance, under mesophilic conditions, the activity and growth rate of bacteria decrease by 50% for each 10 °C drop in temperature. As the temperature is increased up to 37 °C, the time required for digestion is reduced, up to a point beyond which a further increase in temperature reduces the biogas yield. Biogas production falls when the temperature decreases to 20 °C and stops entirely at 10 °C [14, 16]. Nutrients are added to a digester to support the bacteria required for biodegradation. Also, the composition of the feed must be maintained so as to keep the C:N ratio at the appropriate level, because a low C:N ratio may result in ammonium inhibition, especially for nitrogen-rich substrates [17]. Apart from temperature, total solids (TS) and organic loading rate (OLR) are critical to stable operation of the AD process [16]. The loading rate helps determine the amount of FFV feedstock to be added to a digester daily, subject to the size of the digester. Hydraulic retention time measures the time the solids or slurry spend in the digester during the AD process. Depending on the type of substrate and the climatic conditions, HRT can be as high as 100 days [18, 19]. Apart from the aforementioned operating parameters, several kinetic and stoichiometric parameters are associated with microbial growth and chemical reaction [20, 21]. Moreover, anaerobic digestion is vulnerable to process instability due to substantial variation in feedstock composition and the unpredictability of microbial activities [22]. This further complicates the AD process.
Although FFV wastes are readily available for AD, their degradation can be very complex because of their characteristics. As a result of this complexity, mathematical prediction is highly challenging, even though AD is theoretically well understood [23]. Moreover, conventional analytical techniques are time consuming, expensive, and demanding in terms of equipment. Therefore, there is a need for modelling approaches that can provide dynamic information about the AD process condition. Artificial intelligence (AI) models are data-driven techniques that treat the physical process or system as a black box characterized by input and output measurements. This allows high predictive capability based on observations [24]. AI can outperform theoretical or mathematical approaches for complex systems in which multiple parameters and non-linear dependencies influence the process [23]. AI models have been successfully applied in predicting biogas yield due to their ability to generalize and learn complex input-output relationships [25,26,27,28,29]. Several AI models have been used to model AD processes, though only a few exist for the prediction of biogas yield from FFV waste [7, 30,31,32]. Kanat and Saral [33] and Yetilmezsoy et al. [34] developed models for the prediction of biogas production from molasses wastewater using artificial neural networks (ANN). They noted the ability of ANN to determine the interdependence in an AD process without prior knowledge of the mathematical principles or governing equations. Good prediction results were obtained based on statistical metrics. In a study carried out by Neto et al. [35], the effects of seven (7) critical operating parameters were evaluated and deployed to predict biogas yield using an ANN.
Before this, most existing studies considered only one, two, or three operating factors despite the importance of the others in extracting valuable information for the prediction and optimization of AD processes [36,37,38]. The major limitation of ANN is its inability to guarantee a globally optimal solution and its difficulty with knowledge representation [39, 40]. This is almost unavoidable considering the fundamental black-box processing paradigm and the many topologies that exist in neural computing [40]. The prediction of FFV biogas production can therefore benefit from a system that combines the advantages of ANN and fuzzy systems with evolutionary algorithms. Fuzzy systems can represent comprehensive linguistic knowledge and reason through fuzzy rules, though they lack a mechanism for parameter tuning [40, 41]. Neural networks, on the other hand, can be trained and tuned from a stream of input-output data. The combination of a fuzzy system and a neural network gives rise to the adaptive neuro-fuzzy inference system (ANFIS). ANFIS is robust, combining the learning ability of neural networks with the modelling of uncertainty, linguistic concepts, and expert knowledge. An ANFIS model can adapt to variation in system conditions, handle noisy data, and quickly model systems with non-linear process structures using low computational resources [42]. ANFIS is a prediction technique used in numerous fields of bioenergy exploration and conversion due to its capacity to map inputs to outputs within a solution space so that local optima are avoided while fuzzy factors are considered [43, 44]. When building an ANFIS model, the choice of clustering technique must be considered carefully given its impact on the accuracy and precision of the model [45]. Clustering methods help identify the group to which an observation belongs; thus, an unsuitable clustering approach may reduce the accuracy of the model [45, 46].
The genetic algorithm (GA), as an evolutionary technique, has been extensively used in different fields [47,48,49]. Its preference over other evolutionary techniques has been associated with its parallelism and its ability to deal with complex problems. GAs are applicable to any optimization problem: stationary or non-stationary, continuous or discontinuous, linear or non-linear, or random with noise [50]. The GA optimization method stems from Darwin's theory of evolution, which focuses on natural selection and survival of the fittest in biological genetics [51]. The main objective of the approach is to produce offspring with better genetic fitness than their progenitors. This principle of evolution is implemented through reproduction, crossover, and mutation. For hyperparameter optimization, the GA optimizes the base model's hyperparameters according to an objective function within the solution search space until it converges on an improved solution. For this reason, the GA method has been hybridized with other intelligent predictive models, leading to improved prediction accuracy and error minimization via parameter optimization [25, 52]. A hybrid model comprising ANFIS optimized with GA promises better results by mitigating the curse of dimensionality and the loss of interpretability of the model when used on large input datasets [53].
Existing studies show that the hybridization of GA and ANFIS has been widely and successfully adopted to improve prediction and optimization capabilities. However, to the best of the authors’ knowledge, the use of GA-ANFIS, including an investigation of the effect of clustering techniques, for the prediction of biogas yield from FFV has not been previously reported. The closest application was the prediction of the heating value of biomass [54]. Therefore, building on the advantages of ANFIS [38], this study applies a GA-optimized ANFIS model to predict the cumulative biogas production from FFV waste. Accordingly, the specific objectives of the present study are: (1) to develop a GA-ANFIS model for the prediction of biogas yield from FFV; (2) to investigate the effect of clustering techniques on the performance of the developed model; and (3) to compare the developed predictive models based on several statistical performance indicators. The proposed model utilizes feeding, VS, pH, HRT, OLR, temperature, and reactor volume data as input variables. Sensitivity analysis was carried out to determine the relative contribution of each input parameter to the prediction of the output. The results of these models were compared with a previously proposed model based on known performance metrics.
2 Materials and methodology
2.1 Overview of ANFIS model
Binary logic techniques often fail to closely approximate complex and non-linear problems. This is largely due to the insufficient knowledge or judgment error associated with human experts and the dynamic nature of the system. In such cases, the adaptive neuro-fuzzy inference system (ANFIS) has an advantage since it combines the linguistic transparency of fuzzy inference with the self-learning capability of neural networks. In ANFIS modelling, the fuzzy inference system and the artificial neural network can be combined according to the Mamdani [55] or Takagi-Sugeno [56] topology. The Takagi-Sugeno system is more computationally efficient and more amenable to rule generation alongside the optimization techniques of ANN, while ensuring the continuity of the output space [57]. In short, an ANFIS structure combines information obtained from a fuzzy logic system and an artificial neural network. It comprises numerous membership function (MF) parameters tuned using optimization methods [58]. A typical ANFIS structure is a five-layer topology, each layer having a number of nodes defined by logical sets of statements that perform specific tasks [26]. Specifically, the ANFIS structure comprises the fuzzification layer, the rule layer, the normalization layer, the defuzzification layer, and the total output layer, as discussed by Adedeji et al. [43] and Miller et al. [59]. Once the links between the layers have been established, each layer’s outputs are used as the inputs of the next layer. A typical ANFIS structure with two input parameters, x and y, and a single output parameter fi, is governed by the rules expressed in Eqs. (1) and (2):
where fuzzy terms are represented by M and N and fi(x, y) is a first-order Takagi-Sugeno fuzzy model with f1 and f2 indicating the fuzzy-if-then rules. The details of the fuzzy layers and their mathematical expressions have been discussed elsewhere [43, 60].
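For reference, a first-order Takagi-Sugeno rule base with two inputs is conventionally written as below. This is the standard textbook form, given here as an assumption that is consistent with the symbols used in this section (fuzzy terms \(M_i\) and \(N_i\), consequent parameters \(p_i\), \(q_i\), \(r_i\)); the index convention is illustrative.

```latex
\begin{align}
\text{Rule 1: if } x \text{ is } M_1 \text{ and } y \text{ is } N_1,
  \text{ then } f_1 &= p_1 x + q_1 y + r_1 \tag{1} \\
\text{Rule 2: if } x \text{ is } M_2 \text{ and } y \text{ is } N_2,
  \text{ then } f_2 &= p_2 x + q_2 y + r_2 \tag{2}
\end{align}
```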
The five layers of the ANFIS model are shown in Fig. 1 and are briefly discussed below. Each node in a layer performs a different function, and the nodes are optimized through learning processes. The node-to-node connection lines show the flow direction and do not imply any weight.
2.1.1 Fuzzification layer
This is the first ANFIS layer, in which each neuron is adaptive, so that individual weights are updated in the course of learning. Here, the input parameters are expressed by Gaussian membership functions. Other membership functions can be used in addition to the Gaussian membership function, with the one giving the lowest error selected in the learning process. The node outputs are expressed by Eqs. (3) and (4):
where \(\mu_{A_i}\) and \(\mu_{B_{i-2}}\) are fuzzy membership functions.
2.1.2 Rule layer
The nodes in this layer are non-adaptive, and each rule has a firing strength whose value is estimated using a simple multiplication [3]. The nodes receive the incoming signals from the “IF” part of the fuzzy rule and output their product. This output, as expressed in Eq. (5), represents the fitness of the fuzzy rule.
2.1.3 Normalization layer
This is the third layer of ANFIS and is also called the summation layer. This layer computes the normalized firing strength of the nodes as expressed in Eq. (6):
2.1.4 Defuzzification layer
The fourth layer is a defuzzification layer and its nodes are adaptive. The normalized firing strength of each rule is multiplied by a first-order polynomial function, as shown in Eq. (7):
where {pi, qi, ri} is the parameter set of the node and \(\overline{w_i}\) is the normalized firing strength from the third layer.
2.1.5 Output layer
This is the last layer of ANFIS, also called the summation output layer. It has a single non-adaptive node that sums all the incoming signals to produce the output. The overall output is computed as shown in Eq. (8). This value is a continuous (crisp) value rather than a fuzzy set.
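The five layers described above can be sketched as a single forward pass. The following Python example is a minimal illustration, assuming two inputs, two rules, and Gaussian membership functions; the centers, widths, and consequent parameters are hypothetical values chosen for the demonstration, and the code is not the MATLAB implementation used in the study.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership degree of `value` (used in layer 1)."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input, first-order Takagi-Sugeno ANFIS.

    premise:    per rule, ((center_x, sigma_x), (center_y, sigma_y))
    consequent: per rule, (p, q, r) so that f = p*x + q*y + r
    """
    # Layer 1 (fuzzification): membership degree of each input for each rule
    memberships = [
        (gaussian_mf(x, cx, sx), gaussian_mf(y, cy, sy))
        for (cx, sx), (cy, sy) in premise
    ]
    # Layer 2 (rule): firing strength is the product of the memberships
    firing = [mu_x * mu_y for mu_x, mu_y in memberships]
    # Layer 3 (normalization): each firing strength divided by their sum
    total = sum(firing)
    normalized = [w / total for w in firing]
    # Layer 4 (defuzzification): normalized strength times the linear consequent
    weighted = [
        w_bar * (p * x + q * y + r)
        for w_bar, (p, q, r) in zip(normalized, consequent)
    ]
    # Layer 5 (output): sum of the incoming signals, a crisp value
    return sum(weighted), normalized


# Hypothetical parameters for illustration only
premise = [((0.0, 1.0), (0.0, 1.0)), ((1.0, 1.0), (1.0, 1.0))]
consequent = [(1.0, 1.0, 0.0), (2.0, -1.0, 0.5)]
output, normalized = anfis_forward(0.5, 0.5, premise, consequent)
```

In training, the premise parameters (layer 1) and the consequent parameters (layer 4) are the quantities that are adjusted, which matches the description of layers 1 and 4 as adaptive.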
The performance of a fuzzy logic system is linked to how well the membership functions are normalized. Moreover, fuzzy logic applies the correlation between the antecedents and consequents, including linguistic variables, to produce a specific output. Data clustering is an essential process used in assigning membership functions such that the tuned membership functions are generated according to the expert system’s knowledge. Three (3) clustering techniques were deployed in this study, and they are briefly discussed below. Detailed information about these techniques can be obtained from the studies conducted by Adedeji et al. [43] and Rao et al. [61].
2.1.6 Fuzzy c-means clustering
Fuzzy c-means (FCM) is a data clustering approach that divides a dataset into several clusters, with each data point belonging to every cluster to a varying degree, so that a point may belong to more than one cluster. FCM was first presented by Dunn [62] and later enhanced by Bezdek [63]. However, the number of clusters must be fixed in advance based on prior assumptions. In FCM clustering, U is the membership matrix giving the degree of membership of each element in each cluster. The clustering algorithm minimizes the c-means objective function Jm(U, v), as presented in Eq. (9):
where
- \(D(i, j)^2 = \lVert X_i - V_j \rVert^2\): the squared distance between the element \(X_i\) and the centroid \(V_j\)
- \(V_j\): the centroid of cluster j
- C: number of clusters, 2 ≤ C < n
- m: fuzzification index of the algorithm
- \(U(i, j)^m\): degree of membership of element \(X_i\) in cluster j, raised to the power m
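The quantities defined above can be tied together in a short sketch. The Python example below assumes the standard Bezdek formulation, in which the objective is \(J_m = \sum_i \sum_j U(i,j)^m D(i,j)^2\) and memberships and centroids are updated alternately; the function names, the toy data, and the choice of initial centroids are hypothetical and serve only as an illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance D(i, j)^2 between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def update_memberships(points, centers, m):
    """u_ij = 1 / sum_k (D_ij^2 / D_ik^2)^(1 / (m - 1)); each row sums to 1."""
    memberships = []
    for x in points:
        # Clamp distances to avoid division by zero when x sits on a centroid
        d2 = [max(squared_distance(x, v), 1e-12) for v in centers]
        row = [
            1.0 / sum((dj / dk) ** (1.0 / (m - 1.0)) for dk in d2)
            for dj in d2
        ]
        memberships.append(row)
    return memberships


def update_centers(points, memberships, m):
    """v_j = sum_i u_ij^m x_i / sum_i u_ij^m."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * x[d] for w, x in zip(weights, points)) / total
             for d in range(dim)]
        )
    return centers


def objective(points, centers, memberships, m):
    """J_m(U, v) = sum_i sum_j u_ij^m * D(i, j)^2."""
    return sum(
        u[j] ** m * squared_distance(x, v)
        for x, u in zip(points, memberships)
        for j, v in enumerate(centers)
    )


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate membership and centroid updates for a fixed number of passes."""
    centers = [list(c) for c in initial_centers]
    for _ in range(iterations):
        memberships = update_memberships(points, centers, m)
        centers = update_centers(points, memberships, m)
    return centers, update_memberships(points, centers, m)


# Toy one-dimensional data with two obvious groups (illustrative only)
points = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]]
centers, memberships = fuzzy_c_means(points, [[0.0], [10.2]])
cost = objective(points, centers, memberships, 2.0)
```

Because the number of clusters is passed in through `initial_centers`, the sketch also reflects the point made above that FCM requires the cluster count to be chosen beforehand.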
2.1.7 Subtractive method
Subtractive clustering is a fast, one-pass algorithm for estimating the number of clusters and their centers in a set of data [64]. The subtractive clustering (SC) algorithm is used to automatically generate the tuned membership functions in accordance with the domain. For this purpose, the radius that determines a cluster’s range of influence in the data space is specified [61]. A small radius produces small clusters and therefore a larger number of them, whereas a large radius produces large clusters and fewer of them. Cluster formation is based on the density calculated by Eq. (10):
where \(r_a\) is the cluster radius. The point with the greatest density is chosen as the first cluster center \(x_{c1}\); after this, the density measure of each data point \(x_i\) in the next iteration is obtained from Eq. (11):
The iteration process continues until a sufficient number of clusters are achieved, and all the data points fall within the radii of a cluster center.
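The density-and-revision procedure described above can be sketched as follows. This Python example assumes the commonly cited formulation of subtractive clustering, with density \(D_i = \sum_j \exp(-\lVert x_i - x_j \rVert^2 / (r_a/2)^2)\) and a revision radius \(r_b\) taken as a multiple of \(r_a\). The squash factor of 1.5, the single stopping ratio of 0.15, and the toy data are assumptions made for illustration; the stopping rule is a simplification of the accept/reject thresholds used in full implementations.

```python
import math


def subtractive_clustering(points, radius, squash=1.5, stop_ratio=0.15):
    """Return cluster centers chosen from `points` by density peaks.

    Assumes the data are already scaled to a comparable range per dimension.
    """
    alpha = 4.0 / radius ** 2             # 1 / (r_a / 2)^2
    beta = 4.0 / (squash * radius) ** 2   # 1 / (r_b / 2)^2, r_b = squash * r_a

    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Initial density of every point: high when many neighbours lie within r_a
    density = [
        sum(math.exp(-alpha * sqdist(p, q)) for q in points) for p in points
    ]
    first_peak = max(density)
    centers = []
    while True:
        peak = max(density)
        # Simplified stopping rule: remaining peaks are too weak to be centers
        if peak < stop_ratio * first_peak:
            break
        center = points[density.index(peak)]
        centers.append(center)
        # Revise densities: subtract the influence of the newly chosen center
        density = [
            d - peak * math.exp(-beta * sqdist(p, center))
            for d, p in zip(density, points)
        ]
    return centers


# Toy two-dimensional data with two tight groups (illustrative only)
points = [(0.0, 0.0), (0.05, 0.0), (0.0, 0.05),
          (1.0, 1.0), (0.95, 1.0), (1.0, 0.95)]
centers = subtractive_clustering(points, radius=0.5)
```

Unlike the FCM sketch, the number of clusters is not supplied here; it follows from the radius and the stopping rule, which is the property the text attributes to SC.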
2.1.8 Grid partitioning (GP)
Grid partitioning is often deployed when the dimension of the input space is small. GP differs from subtractive clustering and fuzzy c-means because it applies similar membership functions across the input space to generate identical partitions within the symmetric function [43]. Fuzzy rules can be generated from the input-output dataset deployed during training, which allows a rapid learning process and reduces computation time (CT). The number of fuzzy if-then rules is equal to \(M^n\), where n is the input dimension and M is the number of fuzzy subsets into which each input variable is partitioned. The performance of this technique depends significantly on the size of the input and of the grid. A finer grid typically performs better, and adaptive grid partitioning can be deployed to optimize the size and location of the fuzzy grid regions [65]. However, GP is limited by the exponential explosion of the number of fuzzy rules (combinations of membership functions) as the number of input parameters increases, which poses a significant limitation to performance [66].
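The growth of the rule base with the number of inputs can be made concrete with a small calculation. The Python sketch below evaluates \(M^n\) for n = 7 inputs, the input count used in this study; the values of M are hypothetical, since the number of membership functions per input is not stated in this section.

```python
from itertools import product


def grid_rule_count(n_inputs, n_mfs):
    """Number of if-then rules under grid partitioning: M ** n."""
    return n_mfs ** n_inputs


def grid_rules(n_inputs, n_mfs):
    """Enumerate rule antecedents as tuples of membership-function indices."""
    return list(product(range(n_mfs), repeat=n_inputs))


# Rule counts for seven inputs and a few hypothetical values of M
rule_counts = {n_mfs: grid_rule_count(7, n_mfs) for n_mfs in (2, 3, 4)}
for n_mfs, count in rule_counts.items():
    print(f"M = {n_mfs}: {count} rules")
```

With seven inputs, even two membership functions per input give 128 rules and three give 2187, each with its own consequent parameters to fit, which illustrates why GP becomes hard to train as the input dimension grows.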
2.2 Data collection and processing
The dataset of 864 observations deployed in this study was developed by Neto et al. [35]. Specifically, the data came from the experimental study carried out by those authors and from several other authors whose works have been published in reputable journals. The data were gathered across different seasons and geospatial locations under varying conditions [35]. They cover a wide range of actual input and output parameters that dictate the behavior of FFV wastes. Seven (7) variables were considered as inputs: organic loading rate (OLR), volatile solids (VS), pH, hydraulic retention time (HRT), temperature, retention time, and reaction volume; cumulative biogas production was the output. The impact of each input variable on the output was established through sensitivity analysis, in order to identify the influential variables that contribute significantly to the prediction of cumulative biogas yield. The type of reactor (anaerobic sequencing batch reactor (ASBR) or continuous stirred tank reactor (CSTR)), the feeding mode (semi-continuous or continuous), and the number of stages in the digester (one, two, or three) were described by discrete parameters. The data collected from the database were randomized and subsequently divided in a ratio of 7:3 for training and testing, respectively. The statistical distribution of the FFV waste data is presented in Table 1. The training performance is expressed through a comparison between the actual and predicted data in this step.
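The randomization and 7:3 split described above can be sketched as follows. This Python example is illustrative only: the seed, the rounding rule, and the use of integer placeholders in place of the actual records are assumptions, and the exact partition used in the study is not specified here.

```python
import random


def shuffle_split(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of `records` and split it into training and testing sets."""
    rng = random.Random(seed)   # fixed seed (an assumption) for repeatability
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Placeholder records standing in for the 864 rows of the dataset
records = list(range(864))
train, test = shuffle_split(records)
print(len(train), len(test))
```

Under these assumptions, 864 records yield 605 training rows and 259 testing rows.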
2.3 Model implementation
A hybrid genetic evolutionary algorithm based on a fuzzy inference system was deployed using three separate clustering techniques (fuzzy c-means, grid partitioning, and subtractive clustering). The decision to assess the impact of the clustering technique was based on previous studies [15, 20], which suggested that the choice of tuning hyperparameters such as the clustering method affects the model’s performance. Figure 2 shows the flow diagram for the estimation of cumulative biogas yield. The GA starts by initializing a randomly generated population. In each successive generation, a percentage of the existing population is selected to breed a new generation. The population is ranked, and the final solution is then selected through a fitness-based procedure. In this study, roulette wheel selection was used since it has been identified as the most efficient technique for parent selection. Crossover of the fittest parents is performed to produce new offspring that reflect the attributes of the paired parents. Crossover is followed by mutation, in which new solutions are sought within the available search space to obtain novel results that could help in arriving at an efficient solution. After the optimal solution has been selected, a clustering technique is deployed as required by the ANFIS model; then, the model is trained using the processed data. If the stopping criteria are met, the model is tested with hold-out data; otherwise, the training process is repeated until the stopping criteria are satisfied. The evolutionary genetic algorithm was coded in MATLAB R2020a on a laptop with an 11th Gen Intel(R) Core(TM) i7-1165G7 CPU @ 2.80 GHz, 32 GB RAM, and a 2 TB SSD. Iteration limits of 1000, 800, and 600 were tested, but there was no discernible difference among the three resulting cost functions.
Training ends once no further progress is made and the maximum number of iterations has been reached. As a result, the smallest iteration limit (600 iterations) was chosen for this investigation to reduce computational time.
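The selection, crossover, and mutation loop described above can be sketched in a few lines. The Python example below is a generic real-coded GA with roulette wheel selection, not the MATLAB code used in the study. The fitness transform 1/(1 + cost), the single-point crossover, the Gaussian mutation, the elitism step, the parameter values, and the `sphere` cost function (a stand-in for the ANFIS training error) are all assumptions made for illustration.

```python
import random


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness."""
    pick = rng.uniform(0.0, sum(fitness))
    running = 0.0
    for individual, fit in zip(population, fitness):
        running += fit
        if running >= pick:
            return individual
    return population[-1]


def genetic_minimize(cost, n_genes, bounds, pop_size=30, generations=100,
                     crossover_rate=0.8, mutation_rate=0.1, seed=1):
    """Minimize a non-negative `cost` over real-valued chromosomes."""
    rng = random.Random(seed)
    low, high = bounds
    # Random initial population within the search bounds
    population = [[rng.uniform(low, high) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = min(population, key=cost)
    for _ in range(generations):
        # Lower cost gives higher fitness (requires cost >= 0)
        fitness = [1.0 / (1.0 + cost(ind)) for ind in population]
        offspring = [list(best)]  # elitism (assumption): keep the current best
        while len(offspring) < pop_size:
            parent_a = roulette_select(population, fitness, rng)
            parent_b = roulette_select(population, fitness, rng)
            child = list(parent_a)
            # Single-point crossover between the two selected parents
            if n_genes > 1 and rng.random() < crossover_rate:
                point = rng.randrange(1, n_genes)
                child = parent_a[:point] + parent_b[point:]
            # Gaussian mutation, clipped to the search bounds
            for g in range(n_genes):
                if rng.random() < mutation_rate:
                    step = rng.gauss(0.0, 0.1 * (high - low))
                    child[g] = min(high, max(low, child[g] + step))
            offspring.append(child)
        population = offspring
        best = min(population, key=cost)
    return best, cost(best)


def sphere(chromosome):
    """Hypothetical stand-in for the model error to be minimized."""
    return sum(g * g for g in chromosome)


_, initial_cost = genetic_minimize(sphere, 3, (-5.0, 5.0), generations=0)
best, best_cost = genetic_minimize(sphere, 3, (-5.0, 5.0), generations=100)
```

In a GA-ANFIS setting, each chromosome would encode the membership-function and consequent parameters, and `cost` would return a prediction error such as the RMSE on the training data.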
Table 2 presents the step-by-step procedure that was followed in the implementation of GA-ANFIS. It should be noted that the iterative process continues until the stopping criteria are satisfied.
The genetic algorithm is sensitive to its hyperparameters, which makes tuning very important: an appropriate selection of these parameters can reduce the prediction errors [67]. The clustering technique is also selected to enhance the overall performance of the model, since clustering techniques reveal the intrinsic relationships within the dataset [68, 69]. The learning and optimization parameters used in this study are shown in Table 3.
2.4 Model performance analysis
The models applied in this study were evaluated using the relevant statistical metrics. The mathematical expressions of these metrics are presented in Table 4.
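The metrics named in this study (RMSE, MAD, MAPE, error STD, and R2) can be computed as in the following Python sketch. The definitions used here are the conventional ones and are an assumption, since the expressions of Table 4 are not reproduced in this text: MAD is taken as the mean absolute error, the error standard deviation is computed with the population formula, and MAPE is expressed in percent. The sample values are hypothetical.

```python
import math


def error_metrics(actual, predicted):
    """Return RMSE, MAD, MAPE (%), error STD, and R2 for paired values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mad = sum(abs(e) for e in errors) / n
    # MAPE requires non-zero actual values
    mape = 100.0 / n * sum(abs(e / a) for e, a in zip(errors, actual))
    mean_error = sum(errors) / n
    error_std = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / n)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"RMSE": rmse, "MAD": mad, "MAPE": mape,
            "STD": error_std, "R2": r2}


# Hypothetical measured and predicted values for illustration
metrics = error_metrics([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])
```

Under these definitions, a MAPE of 25 means that the predictions deviate from the measured values by 25% on average; it does not by itself give the share of samples predicted correctly.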
3 Results and discussions
This section discusses the results obtained during the training and testing phases of the model. The performance evaluation results are reported, and a further comparison is drawn with the study conducted by Neto et al. [35]. The influence of the seven input variables on the cumulative biogas production is shown in Fig. 3. The sensitivity analysis indicates that all variables influence the cumulative biogas yield, though to different degrees. HRT and VS were the most influential, contributing 35% and 31%, respectively, to the prediction of cumulative biogas production. pH contributes a further 22% to the output, while the reactor volume is the least influential variable. The significance of HRT is due to its impact on other variables such as temperature and substrate composition [71, 72]. When the HRT is reduced, the activity of the bacteria is reduced, while a longer HRT requires a larger digester, leading to higher cost and lower efficiency. The required HRT changes with temperature: a shorter HRT suffices under thermophilic conditions, while a longer HRT is required under mesophilic conditions. The VS concentration of the substrate can also affect biogas production. For instance, comparing the influent and effluent of the digester can help determine the degree of feedstock degradation.
It is noteworthy that microbial growth during anaerobic digestion can be affected by pH. In a study conducted by Jayaraj et al. [73], the effects of different pH values (5, 6, 7, 8, 9, 10) on biogas yield from food waste after 30 days of retention were investigated; pH 7 produced a better biogas yield and bacterial growth than the other values. There is a relationship between the feeding rate and the OLR: a higher OLR at a higher feeding rate can produce a larger biogas volume, and vice versa [74]. OLR also affects the microbial population as well as the reactor performance. It is critical to optimize OLR, since an excessive value may cause acidification and a subsequent reduction or stoppage of biogas production, while a low OLR may reduce biogas production efficiency [75]. Temperature affects the HRT, OLR, VS, microbial growth, and the cumulative biogas yield. The rate of biogas production is enhanced at higher temperature [12, 76], and when the operating temperature changes, the biogas yield changes as well [32, 77, 78].
After the effects of all variables had been verified, the GA-ANFIS model was implemented in the MATLAB environment, and the resulting performance of each clustering technique is discussed below. Figures 4, 5, and 6 show the experimental and predicted values based on SC, FCM, and GP at the testing stage. The prediction based on SC shows a strong relationship and satisfactory agreement between the predicted and experimental values of biogas production, with significantly fewer mispredictions. There were more notable instances of misprediction with the FCM clustering technique, which may have been due to the unequal sizes and densities of the clusters [79]. The worst prediction scenario, characterized by gross misprediction and underfitting, was noted when the grid partitioning method was deployed. This affected the model accuracy, as the MAPE, R2, MAD, and RMSE reported were poor. Although several tunings were performed using different parameters to validate the GP result, the same prediction pattern was obtained. A similar observation was made by Adedeji et al. [69] in their short-term prediction of wind turbine power output. This may be due to high bias and dimensionality, which increased the complexity of the model and hampered its ability to learn the pattern of the training data. The better performance of SC may result from its capability to automatically extract rules that fully account for the mobility and distribution of nodes. FCM, on the other hand, is highly sensitive to outliers in the biogas dataset because of the Euclidean norm used to measure the similarity between the cluster centers and the data points [66]. Moreover, the modelling was performed on a real-world FFV dataset, which contained noise and outliers, in order to demonstrate the model’s effectiveness, efficiency, and robustness. Therefore, the sensitivity of FCM to noise and outliers may have been partly responsible for its poor performance [79, 80].
Apart from the visual observation of the agreement between the experimental and predicted biogas yield, a statistical evaluation was performed, and the results are presented in Table 5. Generally, MAPE estimates the model's forecast accuracy and fitness, while MAD and RMSE assess the magnitude of the average prediction error. Values of these metrics closer to zero are therefore preferred: lower values of RMSE, MAPE, and standard deviation error indicate that the predictive model is more accurate and has less error. The SC clustering technique offered the best result of the three clustering methods at the training and testing stages for all evaluation metrics except the computation time (CT). This may be due to the ability of SC to handle data groups that are numerically different but similar, as well as sparse data points, in a multidimensional space [81]. Although the computation time (CT) was 70 s, the MAPE of the FC model was approximately 59%, which means that its predictions deviated from the experimental values by about 59% on average. The poor statistical performance of GP shows that the global optimal value could not be attained despite the optimization technique.
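For readers who wish to reproduce this kind of evaluation, the following is a minimal sketch of the four error metrics using only the Python standard library. The exact formulas used in the study are not restated in this section, so the definitions here are assumptions: MAD is taken as the mean absolute difference between experimental and predicted values, and error STD as the population standard deviation of the prediction errors. The function name `error_metrics` and the sample values are hypothetical.

```python
import math


def error_metrics(y_true, y_pred):
    """Return RMSE, MAD, MAPE (in percent) and error STD.

    Assumed definitions; y_true must not contain zeros (MAPE divides by it).
    """
    errors = [t - p for t, p in zip(y_true, y_pred)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mad = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_error = sum(errors) / n
    # Population standard deviation of the errors (divides by n, not n - 1).
    error_std = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / n)
    return {"RMSE": rmse, "MAD": mad, "MAPE": mape, "error_STD": error_std}


if __name__ == "__main__":
    # Made-up values purely to exercise the function, not data from the study.
    experimental = [1.0, 2.0, 4.0]
    predicted = [1.1, 1.8, 4.4]
    for name, value in error_metrics(experimental, predicted).items():
        print(f"{name}: {value:.4f}")
```

Note that MAPE is an average relative error, not a count of correctly predicted samples, and it becomes unstable when experimental values are close to zero.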
Since the data used in this study were sourced from Neto et al. [35], the results of the current study were compared with their biogas production models based on the coefficient of determination (R²). These are presented in Table 6. For the ANN models from Neto et al. [35], biogas production was predicted using the gradient descent algorithm (Traindx), Bayesian regularization backpropagation (Trainbr), and the Levenberg-Marquardt function (Trainlm). Compared with all these ANN variants, the SC-based evolutionary algorithm performed better for cumulative biogas production, as its R² deviates from 1 by only 0.0004%. On the contrary, all the ANN models deployed by Neto et al. [35] performed better than the fuzzy c-means clustering and GP methods deployed in this study, with the R² of GP considerably closer to zero. The R² value (R² = 0.1872) based on the GP method suggests that the model fits the data poorly and could not explain the majority of the dataset. The likely cause may be the high dimensionality of the dataset. Therefore, it is reasonable to conclude that the evolutionary algorithm based on the subtractive clustering technique outperforms the other clustering techniques in the prediction of cumulative biogas yield.
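As an aid to interpreting these values, the sketch below computes R² under the common definition R² = 1 - SS_res/SS_tot. This is an assumption: the study may instead have used the squared correlation coefficient, which can differ for biased predictions. The function name `r_squared` and the sample values are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: variation the model leaves unexplained.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variation of the data around its mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up values purely to exercise the function, not data from the study.
    experimental = [1.0, 2.0, 3.0, 4.0]
    close_fit = [1.1, 1.9, 3.2, 3.8]
    mean_only = [2.5, 2.5, 2.5, 2.5]
    print(f"close fit: {r_squared(experimental, close_fit):.4f}")
    print(f"mean-only predictor: {r_squared(experimental, mean_only):.4f}")
```

Under this definition, a model that always predicts the mean of the experimental data scores 0, so a value such as 0.1872 indicates that the model accounts for only about 19% of the variance in the observed biogas yield.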
4 Conclusions
A hybrid evolutionary (genetic) algorithm based on an adaptive neuro-fuzzy inference system (ANFIS) was applied to predict biogas yield from FFV. Three (3) clustering techniques (SC, FC, and GP) were considered, and the sensitivity of the input variables was evaluated. The sensitivity analysis indicates that all variables influence the cumulative biogas yield, with HRT and VS being the most influential, contributing 35% and 31%, respectively, to the prediction of cumulative biogas production. The results also demonstrate the effect of the clustering technique on the optimization of biogas production. The hybrid genetic algorithm using a subtractive clustering approach showed highly satisfactory agreement with the experimental data for biogas production. The statistical performance metrics for the training and testing phases indicated that evolutionary ANFIS based on SC could predict the cumulative biogas yield with high accuracy and low error. It also provides better reliability than the other models reviewed in this study. The results confirm the capacity of the hybrid evolutionary (genetic) algorithm based on the subtractive clustering technique to predict the biogas yield from FFV and to serve as an effective tool for the upscaling of anaerobic digestion units, as well as in techno-economic studies toward more efficient energy utilization.
Data availability
The data that support this study are publicly available with the identifier https://doi.org/10.1016/j.fuel.2020.119081.
References
WorldBank (2022) What a waste global database. https://datacatalog.worldbank.org/search/dataset/0039597. Accessed on 28 May 2022
United Nations (2021) Stop food loss and waste, for the people, for the planet. https://www.un.org/en/observances/end-food-waste-day#:~:text=Globally%2C%20around%2014%20percent%20of,and%202%20percent%20in%20retail). Accessed on 10 Feb 2022
Feeding America (2022) How we fight food waste in the US. https://www.feedingamerica.org/our-work/our-approach/reduce-food-waste#:~:text=How%20much%20food%20waste%20is,food%20thrown%20away%20each%20year. Accessed on 09 Feb 2021
FAO (2021) Food wastage footprint. https://www.fao.org/news/story/en/item/196402/icode/
Li R (2022) Integrating the composition of food waste into the techno-economic analysis of waste biorefineries for biodiesel production. Bioresour Technol Rep 20:101254
UNEP (2022) Worldwide food waste. https://www.unep.org/thinkeatsave/get-informed/worldwide-food-waste. Accessed on 28 May 2022
Beltramo T, Klocke M, Hitzmann B (2019) Prediction of the biogas production using GA and ACO input features selection method for ANN model. Inform Process Agric 6(3):349–356
Rotaru A-E et al (2014) A new model for electron flow during anaerobic digestion: direct interspecies electron transfer to Methanosaeta for the reduction of carbon dioxide to methane. Energy Environ Sci 7(1):408–415
Chew KR et al (2021) Effects of anaerobic digestion of food waste on biogas production and environmental impacts: a review. Environ Chem Lett 19(4):2921–2939
Mutungwazi A, Ijoma GN, Matambo TS (2021) The significance of microbial community functions and symbiosis in enhancing methane production during anaerobic digestion: A review. Symbiosis 83(1):1–24
Parawira W (2004) Anaerobic treatment of agricultural residues and wastewater-application of high-rate reactors. Lund University
Induchoodan T, Haq I, Kalamdhad AS (2022) Factors affecting anaerobic digestion for biogas production: a review. Adv Organic Waste Manage :223–233
Maile I, Muzenda E, Mbohwa C (2016) Optimization of biogas production through anaerobic digestion of fruit and vegetable waste: a review. In 2016 7th International Conference on Biology, Environment and Chemistry, vol. 98
Cioabla AE, Ionel I, Dumitrel G-A, Popescu F (2012) Comparative study on factors affecting anaerobic digestion of agricultural vegetal residues. Biotechnol Biofuels 5(1):1–9
Babaei A, Shayegan J (2019) Effects of temperature and mixing modes on the performance of municipal solid waste anaerobic slurry digester. J Environ Health Sci Eng 17(2):1077–1084
Ossa-Arias MDM, González-Martínez S (2021) Methane production from the organic fraction of municipal solid waste under psychrophilic, mesophilic, and thermophilic temperatures at different organic loading rates. Waste Biomass Valorization 12(9):4859–4871
Xiao Y, Zan F, Zhang W, Hao T (2022) Alleviating nutrient imbalance of low carbon-to-nitrogen ratio food waste in anaerobic digestion by controlling the inoculum-to-substrate ratio. Bioresour Technol 346:126342
Salminen EA, Rintala JA (2002) Semi-continuous anaerobic digestion of solid poultry slaughterhouse waste: effect of hydraulic retention time and loading. Water Res 36(13):3175–3182. https://doi.org/10.1016/S0043-1354(02)00010-6
Deng L et al (2016) Effects of hydraulic retention time and bioflocculant addition on membrane fouling in a sponge-submerged membrane bioreactor. Bioresour Technol 210:11–17. https://doi.org/10.1016/j.biortech.2016.01.056
Curletto C, Bulla L, Canovi L, Demicheli F, Venturino E (2023) A mathematical investigation for the simulation and forecasting of a biodigester operations. Math Comput Simul 209:118–152
Smaluch K, Wollenhaupt B, Steinhoff H, Kohlheyer D, Grünberger A, Dusny C (2023) Assessing the growth kinetics and stoichiometry of Escherichia coli at the single-cell level. Eng Life Sci 23(1):e2100157
Ajayi-Banji A, Rahman S (2022) A review of process parameters influence in solid-state anaerobic digestion: Focus on performance stability thresholds. Renew Sustain Energy Rev 167:112756
De Clercq D, Wen Z, Fei F, Caicedo L, Yuan K, Shang R (2020) Interpretable machine learning for predicting biomethane production in industrial-scale anaerobic co-digestion. Sci Total Environ 712:134574
Rudin C (2019) Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Machine Intell 1(5):206–215
Olatunji OO, Akinlabi S, Madushele N, Adedeji PA (2021) A GA-ANFIS Model for the Prediction of Biomass Elemental Properties. Trends Manufact Eng Manage:1099–1114
Lakovic N et al (2021) Management of higher heating value sensitivity of biomass by hybrid learning technique. Biomass Convers Bioref:1–8
Phromphithak S, Onsree T, Tippayawong N (2021) Machine learning prediction of cellulose-rich materials from biomass pretreatment with ionic liquid solvents. Bioresour Technol:124642
Khatri N, Khatri KK (2022) Artificial intelligence for modeling and optimization of the biogas production. Artif Intell Renew Energy Syst:93–113
Arismendy L, Cárdenas C, Gómez D, Maturana A, Mejía R, Quintero MCG (2020) Intelligent system for the predictive analysis of an industrial wastewater treatment process. Sustainability 12(16):6348
Cruz IA et al (2022) Application of machine learning in anaerobic digestion: perspectives and challenges. Bioresour Technol 345:126433
Baitha R, Kaushal R (2019) Experimental and numerical study of biogas, methane and carbon dioxide produced by pre-treated wheat straw and pre-digested cow dung. Int J Sustain Eng 12(4):240–247
Baitha R, Kaushal R (2020) Numerical and experimental study of biogas, methane and carbon dioxide produced by pre-treated slurry. Int J Ambient Energy 41(2):198–204
Kanat G, Saral A (2009) Estimation of biogas production rate in a thermophilic UASB reactor using artificial neural networks. Environ Model Assess 14(5):607–614
Heydari B, Sharghi EA, Rafiee S, Mohtasebi SS (2021) Use of artificial neural network and adaptive neuro-fuzzy inference system for prediction of biogas production from spearmint essential oil wastewater treatment in up-flow anaerobic sludge blanket reactor. Fuel 306:121734
Neto JG, Ozorio LV, de Abreu TCC, Dos Santos BF, Pradelle F (2021) Modeling of biogas production from food, fruits and vegetables wastes using artificial neural network (ANN). Fuel 285:119081
Almomani F (2020) Prediction of biogas production from chemically treated co-digested agricultural waste using artificial neural network. Fuel 280:118573. https://doi.org/10.1016/j.fuel.2020.118573
Chong DJS, Chan YJ, Arumugasamy SK, Yazdi SK, Lim JW (2023) Optimisation and performance evaluation of response surface methodology (RSM), artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) in the prediction of biogas production from palm oil mill effluent (POME). Energy 266:126449. https://doi.org/10.1016/j.energy.2022.126449
Najafi B, Faizollahzadeh Ardabili S (2018) Application of ANFIS, ANN, and logistic methods in estimating biogas production from spent mushroom compost (SMC). Resourc Conserv Recycl 133:169–178. https://doi.org/10.1016/j.resconrec.2018.02.025
Fan M et al (2017) Artificial neural network modeling and genetic algorithm optimization for cadmium removal from aqueous solutions by reduced graphene oxide-supported nanoscale zero-valent iron (nZVI/rGO) composites. Materials 10(5):544
Alizadeh M, Lewis M, Zarandi MHF, Jolai F (2011) Determining significant parameters in the design of ANFIS. In: 2011 Annual Meeting of the North American Fuzzy Information Processing Society. IEEE, pp 1–6
Abraham A (2001) Neuro fuzzy systems: state-of-the-art modeling techniques. In: Connectionist Models of Neurons, Learning Processes, and Artificial Intelligence: 6th International Work-Conference on Artificial and Natural Neural Networks, IWANN 2001 Granada, Spain, June 13–15, 2001 Proceedings, Part 1 6. Springer, pp 269–276
Şahin M, Erol R (2017) A comparative study of neural networks and ANFIS for forecasting attendance rate of soccer games. Math Comput Appl 22(4):43
Adedeji PA, Akinlabi S, Madushele N, Olatunji OO (2020) Wind turbine power output short-term forecast: a comparative study of data clustering techniques in a PSO-ANFIS model. J Clean Prod:120135
Aghbashlo M et al (2021) Describing biomass pyrolysis kinetics using a generic hybrid intelligent model: a critical stage in sustainable waste-oriented biorefineries. Renew Energy 170:81–91
Oladipo S, Sun Y, Amole A (2022) Performance evaluation of the impact of clustering methods and parameters on adaptive neuro-fuzzy inference system models for electricity consumption prediction during COVID-19. Energies 15(21):7863
Al-Shammari ET et al (2016) Comparative study of clustering methods for wake effect analysis in wind farm. Energy 95:573–579
Rouhibakhsh K, Darvish H, Sabzgholami H, Goodarzi MS (2018) Application of ANFIS-GA as a novel and accurate tool for estimation of interfacial tension of carbon dioxide and hydrocarbon. Petroleum Sci Technol 36(15):1143–1149
Esfandyari M, Esfandyari M, Jafari D (2018) Prediction of thiophene removal from diesel using [BMIM][AlCl4] in EDS Process: GA-ANFIS and PSO-ANFIS modeling. Petroleum Sci Technol:1–7
Armaghani DJ, Mohamad ET, Hajihassani M, Yagiz S, Motaghedi H (2016) Application of several non-linear prediction tools for estimating uniaxial compressive strength of granitic rocks and comparison of their performances. Eng Comput 32(2):189–206
Yang X-S (2021) Chapter 6 - Genetic Algorithms. In: Yang X-S (ed) Nature-Inspired Optimization Algorithms, Second edn. Academic Press, pp 91–100
Adedeji PA, Olatunji OO, Madushele N, Ajayeoba AO (2021) Soft computing in renewable energy system modeling. In: Design, Analysis, and Applications of Renewable Energy Systems. Elsevier, pp 79–102
Moayedi H, Raftari M, Sharifi A, Jusoh WAW, Rashid ASA (2020) Optimization of ANFIS with GA and PSO estimating α ratio in driven piles. Eng Comput 36(1):227–238
Li H-A et al (2021) Neural network-based mapping mining of image style transfer in big data systems. Comput Intell Neurosci 2021
Baghban A, Ebadi T (2019) GA-ANFIS modeling of higher heating value of wastes: Application to fuel upgrading. Energy Sourc Part A: Recover Util Environ Effects 41(1):7–13
Mamdani EH, Assilian S (1975) An experiment in linguistic synthesis with a fuzzy logic controller. Int J Man-Machine Stud 7(1):1–13
Takagi T, Sugeno M (1985) Fuzzy identification of systems and its applications to modeling and control. IEEE Trans Syst Man Cyber 1:116–132
Yeom C-U, Kwak K-C (2018) Performance comparison of ANFIS models by input space partitioning methods. Symmetry 10(12):700
Karaboga D, Kaya E (2018) Adaptive network based fuzzy inference system (ANFIS) training approaches: a comprehensive survey. Artif Intell Rev:1–31
Miller DJ, Nelson CA, Cannon MB, Cannon KP (2009) Comparison of fuzzy clustering methods and their applications to geophysics data. Appl Comput Intell Soft Comput 2009
Olatunji OO, Adedeji PA, Madushele N, Akinlabi S, DiCarlo AA (2022) Modelling biomass elemental composition: a neurofuzzy approach. Proc Comput Sci 200:1736–1745. https://doi.org/10.1016/j.procs.2022.01.374
Rao UM, Sood Y, Jarial R (2015) Subtractive clustering fuzzy expert system for engineering applications. Proc Comput Sci 48:77–83
Dunn JC (1973) A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters
Bezdek JC (2013) Pattern recognition with fuzzy objective function algorithms. Springer Science & Business Media
Chiu SL (1994) Fuzzy model identification based on cluster estimation. J Intell Fuzzy Syst 2(3):267–278
Benmouiza K, Cheknane A (2019) Clustered ANFIS network using fuzzy c-means, subtractive clustering, and grid partitioning for hourly solar radiation forecasting. Theor Appl Climatol 137(1):31–43
Chen M-S, Wang S-W (1999) Fuzzy clustering analysis for optimizing fuzzy membership functions. Fuzzy Sets Syst 103(2):239–254
Devaraj R, Mahalingam SK, Esakki B, Astarita A, Mirjalili S (2022) A hybrid GA-ANFIS and F-Race tuned harmony search algorithm for multi-response optimization of non-traditional machining process. Exp Syst Appl 199:116965
Tang R, Fong S (2018) Clustering big IoT data by metaheuristic optimized mini-batch and parallel partition-based DGC in Hadoop. Future Gen Comput Syst 86:1395–1412
Adedeji PA, Akinlabi S, Madushele N, Olatunji OO (2020) Wind turbine power output very short-term forecast: a comparative study of data clustering techniques in a PSO-ANFIS model. J Clean Prod 254:120135
Olatunji OO, Akinlabi S, Madushele N, Adedeji PA (2019) Estimation of the elemental composition of biomass using hybrid adaptive neuro-fuzzy inference system. BioEnergy Res 12(3):642–652
Sandhu S, Kaushal R (2022) Optimisation of anaerobic digestion of layer manure, breeding manure and cow dung using grey relational analysis. Biomass Convers Bioref:1–13
Tabatabaei M, Valijanian E, Aghbashlo M, Ghanavati H, Sulaiman A, Wakisaka M (2018) Prominent parameters in biogas production systems. Biogas: Fundamentals Process Operation:135–161
Yang L, Huang Y, Zhao M, Huang Z, Miao H, Xu Z, Ruan W (2015) Enhancing biogas generation performance from food wastes by high-solids thermophilic anaerobic digestion: Effect of pH adjustment. Int Biodeterior Biodegradation 105:153–159
Zealand A, Roskilly A, Graham D (2017) Effect of feeding frequency and organic loading rate on biomethane production in the anaerobic digestion of rice straw. Applied Energy 207:156–165
Moriarty K (2013) Feasibility Study of Anaerobic Digestion of Food Waste in St. Bernard, Louisiana. In: A study prepared in partnership with the environmental protection agency for the RE-powering America's land initiative: siting renewable energy on potentially contaminated land and mine sites. National Renewable Energy Lab.(NREL), Golden, CO (United States)
Mao C, Feng Y, Wang X, Ren G (2015) Review on research achievements of biogas from anaerobic digestion. Renew Sustain Energy Rev 45:540–555
Hossain MS et al (2022) Impact of temperature, inoculum flow pattern, inoculum type, and their ratio on dry anaerobic digestion for biogas production. Sci Rep 12(1):6162
Kaushal R, Baitha R (2021) Biogas and methane yield enhancement using graphene oxide nanoparticles and Ca(OH)2 pre-treatment in anaerobic digestion. Int J Ambient Energy 42(6):618–625
Memon KH, Lee D-H (2018) Generalised kernel weighted fuzzy C-means clustering algorithm with local information. Fuzzy Sets Syst 340:91–108
Song J, Cong W, Li J (2017) A Fuzzy C-means Clustering Algorithm for Image Segmentation Using Nonlinear Weighted Local Information. J Inf Hiding Multim Signal Process 8(3):578–588
Smith JN, Reece L, Szaniszlo P, Leary RC, Leary JF (2005) Subtractive clustering analysis: a novel data mining method for finding cell subpopulations. In: Imaging, Manipulation, and Analysis of Biomolecules and Cells: Fundamentals and Applications III, vol 5699. SPIE, pp 354–361
Funding
Open access funding provided by University of Johannesburg.
Author information
Contributions
Conceptualization, O.O.; Data curation, P.A. and O.O.; Formal analysis, O.O., P.A., and Z.R.; Investigation, O.O. and P.A.; Project administration, O.O. and Z.R.; Validation, N.M. and N.J.; Writing - original draft, O.O.; Writing - review & editing, O.O., P.A., N.M., Z.R., and N.J.
Ethics declarations
Ethical approval
The authors do not require any ethical approval since no human or animal samples were involved in this study.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Olatunji, O.O., Adedeji, P.A., Madushele, N. et al. Evolutionary optimization of biogas production from food, fruit, and vegetable (FFV) waste. Biomass Conv. Bioref. 14, 12113–12125 (2024). https://doi.org/10.1007/s13399-023-04506-0
DOI: https://doi.org/10.1007/s13399-023-04506-0