1 Introduction

The textile, printing and dyeing industries are some of the largest producers of dye wastewater, contributing up to about 0.7 million metric tons of chemical dyes produced annually, accounting for 17 to 20% of water pollution worldwide (Pavlović et al. 2014). In Bangladesh, the textile sector currently exports nearly 28 billion USD annually, up to 82% of the country’s total export earnings (Hossain et al. 2018). In 2021, the textile industries in Bangladesh produced approximately 2.91 million metric tons of fabrics and around 349 million metric tons of wastewater generated from conventional dyeing practices (Hossain et al. 2018). Figure 1 represents the water and chemical consumption of the textile processing industry in Bangladesh.

Fig. 1

Water and chemical consumption of textile processing industry in Bangladesh with a production capacity of 1,812 tons to 18,000 tons annually, amounting to 24-h shifts and 25 working days (Uddin et al. 2023)

1.1 Types of Textile Wastewater Treatment

Globally, about 60% of the annual output of synthetic dyes consists of azo compounds (Liu et al. 2022). Azo dyes contain azo functional groups (N=N) and aromatic rings, which make them chemically stable and highly resistant to environmental biodegradation and UV photolysis (Cui et al. 2021). They also have strong chromaticity, ecotoxicity and carcinogenicity, which pose a significant health risk and environmental hazard (Kumar and Gupta 2022). Residual dyestuffs are characterized by intense colour, high organic content and highly stable chemical structures, with strong potential to cause serious environmental pollution (El-Kammah et al. 2022).

Various treatment methods are commonly employed to remove textile dyes from wastewater, including biological degradation (Singh et al. 2022), coagulation-flocculation (Lau et al. 2014), membrane filtration (Wu et al. 2022a), Fenton reagent (Badmus et al. 2020) and photocatalytic degradation (Chairungsri et al. 2022). However, their effectiveness is limited by excessive sludge production, secondary pollution and membrane fouling (Nidheesh et al. 2018). Excessive sludge production from biological treatment requires additional post-treatment and waste management, resulting in large energy consumption and financial expenditure (Shoukat et al. 2019). Membrane fouling is a significant issue in filtration processes because it hampers filtration effectiveness (Wu et al. 2022a). The coagulation-flocculation process contributes to secondary pollution because of the chemical reagents used to remove the pollutants (Januário et al. 2021; Tahraoui et al. 2023). The recovery of these chemical reagents is challenging, resulting in a loss of energy and resources (Ihaddaden et al. 2022). Catalyst poisoning and electron–hole recombination are significant issues in advanced oxidation processes, leading to reduced oxidation potential (Fu et al. 2023; Kanjal et al. 2023). The catalysts used in advanced oxidation processes are also costly (Saravanan et al. 2022). Fenton's reagent generates large amounts of iron sludge through combined flocculation of the reagent with organic compounds (Mechati et al. 2023; Suhan et al. 2021). Fenton's reagent also requires additional pH adjustment to facilitate oxidation, increasing operational costs (Can-Güven 2021; Kebir et al. 2023).
Furthermore, a fluidized three-dimensional electrochemical oxidation process was used to treat methyl orange (MO) wastewater and achieved a removal efficiency of 99.9% in 30 min, whereas the original adsorption capacity of activated carbon was maintained at 64.5% after 8 cycles of adsorption-electrochemical regeneration (Liu et al. 2022). Therefore, the literature recommended using the electrochemical process as an advanced wastewater treatment technique to remove dyes from industrial effluent.

Electrochemical treatment is commonly used to eliminate pollutants at the anode surface via the generation of •OH oxidants and active chlorine species (Hamida et al. 2022). However, few studies have been conducted on AI and machine learning-based optimization of three-dimensional electrochemical treatment of textile wastewater involving graphite intercalation compound (GIC) particle electrodes. Anodic oxidation with boron-doped diamond and mixed metal oxide electrodes suffers from electrode poisoning and the buildup of a biofilm or thin oxide layer, which can decrease electrocatalytic efficiency and service life by 10–90% (El Aggadi et al. 2021). These electrodes also cannot overcome the mass transfer resistance and the short half-life of the oxidizing species, approximately 10⁻⁶ to 10⁻³ s in wastewater media (Chen et al. 2023; Xie et al. 2022). Adsorbent materials made from agricultural sources and carbon-based substituents are non-regenerative and susceptible to heat stress or other physicochemical degradation (Vinayagam et al. 2022). Granular activated carbon (GAC), in turn, has a 10–20% lower regeneration efficiency than GIC, making it unsuitable for use in an electrochemical reactor (Narbaitz and McEwen 2012; Narbaitz and Karimi-Jashni 2012). Hence, we aimed to improve the electrocatalytic efficiency of a three-dimensional electrochemical reactor using an electrically regenerative particle electrode to achieve high mineralization efficiency of dye pollutants in wastewater.

More critically, there is a gap in comparing the prediction efficiency of various artificial intelligence and machine learning-based optimization approaches specific to the three-dimensional electrochemical treatment process. Intelligent control of electrochemical nitrate removal has been based on artificial neural networks, whereas electrochemical sensors have been applied to monitor and remove azo dyes and food colorant substances. None of the past research explored artificial intelligence and machine learning-based optimization techniques for three-dimensional electrochemical treatment of xenobiotic dyes (Meng et al. 2022; Wu et al. 2022b). Past research also focused on the electrochemical conversion of ammonia into harmless nitrogen gas using granular activated carbon as a three-dimensional particle electrode, which was poorly regenerative or of low electrical conductivity compared to graphite intercalation compound (Zhang et al. 2024). Moreover, system perturbations, uncertainties and the variability of operating parameters, and their impact on targeted responses specific to the three-dimensional electrochemical treatment process, have not been accounted for in the current literature. This study therefore explored the uncertainties in the AI optimization of operating parameters such as applied current density, electrolysis time and initial dye concentration, to improve the electrooxidation efficiency of the three-dimensional electrochemical reactor. Most significantly, the novelty of this research lies in finding the best artificial intelligence-based models to improve the prediction efficiency of complex phenomena by applying them to large physical, chemical and biological processes.
Secondly, the research aims to develop accurate artificial intelligence-based models which can be integrated into the upscaled conventional wastewater treatment systems to enhance value engineering, water resources management, energy efficiency, real-time process dynamics, data controllability and streamlining distributed network of process control systems. Unlike other past research, this research also aims to scrutinize the prediction efficiency of different artificial intelligence and machine learning-based models by analysing the level of uncertainties or the effect of various operating parameters on system perturbations and variability to enhance the accuracy and precision of predictive model platforms.

2 Materials and Methods

2.1 Experimental Equipment and Materials

Methyl orange (\({\mathrm{C}}_{14}{\mathrm{H}}_{14}{\mathrm{N}}_{3}{\mathrm{NaO}}_{3}\mathrm{S}\)) of chemical reagent grade was obtained from Chem-Supply, Australia. Commercial GIC was purchased from Sigma-Aldrich, Australia. The particle size of GIC was greater than 300 μm (50 mesh). GIC has an electrical conductivity of approximately 0.8 S/cm. The MO solution was prepared using high-purity distilled water. A UV/Visible spectrophotometer (DR6000, Hach) was used to determine the MO dye concentrations in solution at different time intervals. The maximum absorption occurred at a wavelength λ = 463 nm. The coefficient of variation (COV) for the UV-absorbance analysis of MO was approximately 3.08%, whereas for the TOC analysis (TOC-V CSH, Shimadzu), it was approximately 0.55%.

The experiment was performed in a 6–7 L electrochemical reactor equipped with an anode and a cathode. A more detailed description can be found in Trzcinski and Harada (2023). A graphite plate anode with approximately 70 cm² of electroactive surface area and a stainless steel 316 cathode were connected to a 60 V DC power supply unit (Model GPR-6030D, GW INSTEK, Taiwan) to form a closed-loop electrical circuit. Compressed air at 2 bar was sparged into the anodic compartment of the reactor to mix the GIC and contaminated water. A solution of 0.3% (w/v) NaCl adjusted to pH 2 using HCl was used as the supporting electrolyte.

2.2 MO Adsorption and Electrochemical Oxidation Process

In the 3D electrochemical process, 1 L of MO-contaminated water was first added into the reactor, and the air pressure was set at 2 bar to start the adsorption process. After the adsorption process ended at 20 min, the GIC particle electrodes were allowed to settle in the regeneration zone between the cathode and the anode. The regeneration zone is located within the anodic compartment, where the GIC particle electrodes are electrochemically regenerated when subjected to electrolysis. The current supply ranged from 1.05 to 3.16 A, corresponding to a current density of 15 to 35 mA/cm², applied for 10 min. The mathematical equations used for characterising the three-dimensional process are outlined in subsection 2.2.1.

2.2.1 Mathematical Equations for the Electrochemical Process

In the study of the 3D electrochemical process, a pseudo-first-order kinetic model (Eq. 1) was used to describe the change of concentration over time, and the pseudo-first-order reaction rate constant represents the electrooxidation kinetics of MO removal by the 3D process. The half-life, t₁/₂, is the time required for 50% of the MO pollutants to degrade in an aqueous solution. The combined adsorption and electrochemical oxidation process synergistically maximises the dye and TOC removal efficiencies of MO in aqueous solutions. The pseudo-first-order kinetic rate constant representing the electrooxidation kinetics can be determined from the following equation (Liu et al. 2022):

$$-{\mathrm{log}}_{\mathrm{e}}\left(\frac{{\mathrm{C}}_{\mathrm{t}}}{{\mathrm{C}}_{0}}\right)=\mathrm{kt}$$
(1)

where k represents the kinetic rate constant in min⁻¹; C0 represents the initial dye concentration, which ranged from 50 mg/L to 125 mg/L; Ct represents the dye concentration at time t after a period of adsorption and electrochemical oxidation; and t is the time in min.
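As an illustration of Eq. 1, the sketch below estimates k as the least-squares slope through the origin of −ln(Ct/C0) against t, and derives the half-life as ln 2 / k. The function names and the concentration series are hypothetical (synthetic, noise-free data generated with k = 0.10 min⁻¹), not measurements from this study, and the fitting procedure is an assumption rather than the authors' documented method.

```python
import math

def fit_rate_constant(times_min, concentrations):
    """Slope through the origin of -ln(Ct/C0) versus t (Eq. 1), in 1/min."""
    c0 = concentrations[0]
    y = [-math.log(c / c0) for c in concentrations]
    numerator = sum(t * yi for t, yi in zip(times_min, y))
    denominator = sum(t * t for t in times_min)
    return numerator / denominator

def half_life(k_per_min):
    """Time for the concentration to fall to 50% of its initial value."""
    return math.log(2.0) / k_per_min

# Hypothetical data: C0 = 50 mg/L decaying with k = 0.10 1/min.
times = [0, 5, 10, 20, 30]
concs = [50.0 * math.exp(-0.10 * t) for t in times]

k = fit_rate_constant(times, concs)
print(f"k = {k:.3f} 1/min, t1/2 = {half_life(k):.2f} min")
```

With real, noisy data the fitted slope will not reproduce every point exactly, so the residuals of the linear fit are worth inspecting before quoting k.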

The regeneration efficiency, RE, can be calculated from the following equation:

$$\mathrm{RE}=\frac{{\mathrm{q}}_{\mathrm{r}}}{{\mathrm{q}}_{\mathrm{i}}}\times 100\mathrm{\%}$$
(2)

where qi represents the initial loading of MO (mg/g) onto fresh GIC adsorbent; and qr represents the final loading (mg/g) on the regenerated GIC adsorbent under identical adsorption conditions:

$${\mathrm{q}}_{\mathrm{i}}=\frac{\left({\mathrm{C}}_{0}-{\mathrm{C}}_{\mathrm{i}}\right)\mathrm{V}}{\mathrm{m}}$$
(3)
$${\mathrm{q}}_{\mathrm{r}}=\frac{\left({\mathrm{C}}_{0}-{\mathrm{C}}_{\mathrm{t}}\right)\mathrm{V}}{\mathrm{m}}$$
(4)

where C0 denotes the initial dye concentration (mg/L), Ci represents the dye concentration (mg/L) after adsorption but before electrochemical regeneration, Ct represents the dye concentration (mg/L) after electrochemical regeneration, V is the solution volume (L), m is the mass of GIC adsorbent (g), and t is the regeneration time.
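Eqs. 2 to 4 can be sketched in a few lines. The concentrations below are hypothetical; the 1 L volume and 200 g of GIC are the quantities quoted elsewhere in the text. The example is chosen so that the regenerated loading slightly exceeds the fresh loading, which is how a regeneration efficiency above 100% arises from Eq. 2.

```python
def loading_mg_per_g(c0_mg_l, c_after_mg_l, volume_l, mass_g):
    """MO loading on GIC in mg/g (the common form of Eqs. 3 and 4)."""
    return (c0_mg_l - c_after_mg_l) * volume_l / mass_g

def regeneration_efficiency_pct(q_r, q_i):
    """Eq. 2: loading on regenerated GIC relative to fresh GIC, in percent."""
    return q_r / q_i * 100.0

# Hypothetical concentrations (mg/L), not measured values.
q_i = loading_mg_per_g(50.0, 20.0, 1.0, 200.0)  # fresh GIC
q_r = loading_mg_per_g(50.0, 18.5, 1.0, 200.0)  # regenerated GIC
re_pct = regeneration_efficiency_pct(q_r, q_i)
print(f"q_i = {q_i:.4f} mg/g, q_r = {q_r:.4f} mg/g, RE = {re_pct:.1f}%")
```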

The charge passed per gram of GIC adsorbent is given by the following relationship:

$${\mathrm{Q}}_{\mathrm{t}}=\frac{\mathrm{It}}{\mathrm{m}}$$
(5)

where I is the applied current (A); t is the electrolysis time (min); and m is the mass of GIC adsorbent (g).

To calculate the applied current density, JEO, the following equation is used:

$${\mathrm{J}}_{\mathrm{EO}}=\frac{\mathrm{I}}{\mathrm{SA}}$$
(6)

where I denotes the current applied (A); and SA denotes the surface area of the anode, which was 70 cm².
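A minimal sketch of Eqs. 5 and 6 follows. The 70 cm² anode area, the 1.05 A current, the 10 min electrolysis time and the 200 g of GIC are taken from the text. Converting minutes to seconds so that the charge comes out in coulombs per gram is an assumption, since Eq. 5 is stated with t in minutes, and the function names are illustrative.

```python
ANODE_AREA_CM2 = 70.0  # electroactive anode area quoted in the text

def current_density_ma_cm2(current_a, area_cm2=ANODE_AREA_CM2):
    """Eq. 6, expressed in mA/cm^2."""
    return current_a / area_cm2 * 1000.0

def charge_per_gram_c_g(current_a, time_min, mass_g):
    """Eq. 5, with minutes converted to seconds (assumption) to give C/g."""
    return current_a * time_min * 60.0 / mass_g

j = current_density_ma_cm2(1.05)
q_t = charge_per_gram_c_g(1.05, 10.0, 200.0)
print(f"J = {j:.1f} mA/cm^2, Q_t = {q_t:.2f} C/g")
```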

To calculate the electrical energy consumption per kg of adsorbed MO, the equation is shown below:

$$\mathrm{EC}\left(\frac{\mathrm{kWh}}{\mathrm{kg}}\right)=\frac{\mathrm{I}\times {\mathrm{U}}_{\mathrm{t}}\times \mathrm{t}}{\left({\mathrm{C}}_{0}-{\mathrm{C}}_{\mathrm{t}}\right)\mathrm{V}}$$
(7)

where I is the applied current (A); Ut is the cell potential at time t (V); and V is the MO solution volume, which was 1.0 L.

To calculate the electrical energy consumption per kg TOC of adsorbed MO, the following equation is used:

$$\mathrm{EC}\left(\frac{\mathrm{kWh}}{\mathrm{kg\ TOC}}\right)=\frac{\mathrm{I}\times {\mathrm{U}}_{\mathrm{t}}\times \mathrm{t}}{\left({\mathrm{TOC}}_{0}-{\mathrm{TOC}}_{\mathrm{t}}\right)\mathrm{V}}$$
(8)

where I is the applied current (A); Ut is the cell potential at time t (V); V is the MO solution volume (1.0 L); t is the time (min); and TOC0 and TOCt are the initial and final total organic carbon concentrations (mg/L), respectively.
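The energy consumption expressions above mix amperes, volts, minutes, mg/L and litres, so explicit unit conversions are needed to arrive at kWh/kg. The sketch below makes those conversions visible; they are assumptions about how the reported units are reconciled, and the 10 V cell potential and the concentrations are hypothetical. The same function applies to the TOC-based form by passing TOC concentrations.

```python
def energy_consumption_kwh_per_kg(current_a, cell_voltage_v, time_min,
                                  c0_mg_l, ct_mg_l, volume_l):
    """Electrical energy per kg of pollutant (or TOC) removed."""
    # A * V * h gives Wh; dividing by 1000 gives kWh.
    energy_kwh = current_a * cell_voltage_v * (time_min / 60.0) / 1000.0
    # mg/L * L gives mg; multiplying by 1e-6 gives kg.
    removed_kg = (c0_mg_l - ct_mg_l) * volume_l * 1e-6
    return energy_kwh / removed_kg

# Hypothetical run: 1.05 A at 10 V for 10 min, 50 -> 10 mg/L in 1 L.
ec = energy_consumption_kwh_per_kg(1.05, 10.0, 10.0, 50.0, 10.0, 1.0)
print(f"EC = {ec:.2f} kWh/kg")
```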

In contrast to the actual charge passed (Eq. 5), the theoretical charge required to oxidise the removed MO is given by:

$${\mathrm{Q}}_{\mathrm{th}}=\frac{\mathrm{n}\left({\mathrm{C}}_{0}-{\mathrm{C}}_{\mathrm{f}}\right)\mathrm{VF}}{{\mathrm{M}}_{\mathrm{w}}}$$
(9)

where C0 and Cf are the initial and final MO concentrations in solution taken before and after the adsorption-electrochemical regeneration for 5 cycles; V is the solution volume (L); F is Faraday's constant (96,487 C mol⁻¹); Mw is the molecular weight of MO (327.33 g mol⁻¹); and n represents the number of electrons, which is 90 for complete oxidation and 36 for incomplete oxidation (see Sect. 3.1).

The current efficiency equation is as follows:

$$\mathrm{Current\ efficiency}\left(\mathrm{\%}\right)=\left(\frac{\mathrm{Theoretical\ charge}}{\mathrm{Actual\ charge\ passed}}\right)\times 100\mathrm{\%}$$
(10)
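Eqs. 9 and 10 can be combined into a short calculation. The Faraday constant, the molecular weight and n = 36 are the values quoted in the text; the concentrations are hypothetical. Two assumptions are made here: concentrations in mg/L are converted to g/L before dividing by Mw, and the actual charge passed is taken as the total charge I × t in coulombs.

```python
FARADAY_C_PER_MOL = 96487.0  # value quoted in the text
MW_MO_G_PER_MOL = 327.33     # value quoted in the text

def theoretical_charge_c(n_electrons, c0_mg_l, cf_mg_l, volume_l):
    """Eq. 9 in coulombs, converting mg/L to g/L (assumption)."""
    moles_removed = (c0_mg_l - cf_mg_l) / 1000.0 * volume_l / MW_MO_G_PER_MOL
    return n_electrons * moles_removed * FARADAY_C_PER_MOL

def current_efficiency_pct(q_theoretical_c, current_a, time_min):
    """Eq. 10, taking the actual charge as I * t in coulombs (assumption)."""
    q_actual_c = current_a * time_min * 60.0
    return q_theoretical_c / q_actual_c * 100.0

# Hypothetical case: 50 -> 10 mg/L in 1 L, n = 36, 1.05 A for 10 min.
q_th = theoretical_charge_c(36, 50.0, 10.0, 1.0)
ce = current_efficiency_pct(q_th, 1.05, 10.0)
print(f"Q_th = {q_th:.1f} C, current efficiency = {ce:.1f}%")
```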

After the graphs were generated using the experimental data, various AI and machine learning optimisation techniques were applied to compare any deviation between the experimental and optimised values. Subsection 2.3 briefly summarises the data analysis methods used for the AI and machine learning optimisation techniques.

2.3 Data Analysis Methods

Before optimizing the experimental data using Artificial Neural Networks (ANN), Support Vector Machine (SVM) and random forest (RF), an approximate model must be developed in preparation for training and testing. When configuring the dataset, the network architecture of the AI model must be designed to incorporate activation functions, transfer functions, nodes, layers, etc. Once the model architecture is created, training and testing are performed to evaluate the model performance and to improve the generalization of its predictions. In these procedures, the input operating parameters from the experimental data were passed through the activation functions of the modelled network architecture to generate output response variables. The experimental results were then compared with the AI or machine learning predictions, and the error deviation was quantified by statistical analysis for data fitting with ANN, SVM or random forest decision trees.

2.4 Electrochemical Regeneration of GIC

This electrochemical regeneration experiment was subdivided into three phases:

  1) Initial adsorption: Air was sparged into the reactor containing 200 g of GIC particle electrodes for 20 min. The air pressure was 2 bar to facilitate the mass transfer of dye molecules onto the particle electrodes. After 20 min, the air supply was turned off to allow the GIC particle electrodes to settle onto the bottom of the anodic compartment.

  2) Adsorption-electrochemical regeneration phase: A DC power source supplied a fixed current through the cell during electrochemical regeneration. The electric field was turned on for 10 min to facilitate the electrochemical regeneration of GIC.

  3) Next cycles of adsorption-electrochemical regeneration: The air was turned off, allowing the particle electrodes to settle onto the regeneration zone of the electrochemical cell. The remaining electrochemically treated solution was drained off. A fresh dye solution was added for the next round of adsorption-electrochemical regeneration.

2.5 Analytical Methods

In the following experiment, 1,000 mL of the 50–250 mg/L MO stock solution was subjected to electrochemical treatment. Experiments were carried out at a temperature of 22 °C, and the dye solutions were filtered using a 5 μm filter funnel. A 5,000 μL "Eppendorf" syringe was used to take liquid samples from the dye solutions at intervals ranging from 0 to 30 min. These liquid samples were analysed using a UV/Visible spectrophotometer (λmax = 463 nm, Hach DR6000) and a TOC analyser (Shimadzu TOC-V CSH) to determine the dye and TOC concentrations throughout the electrochemical treatment. The coefficient of variation (COV) for the UV-absorbance analysis of MO is approximately 3.08%, whereas for the TOC analysis, it is approximately 0.55%.

2.6 AI Modelling and Optimization

2.6.1 ANN Procedure

ANN is widely used to solve complex, multivariate and non-linear problems via classification and regression modelling (Khan et al. 2022). The ANN optimization method was applied to model and predict responses influenced by operational variables. ANN is a subset of machine learning algorithms (Oruganti et al. 2023). It mimics the behaviour of the human brain and nervous system with outstanding learning ability. ANN is a black-box model that employs a gradient-descent backpropagation technique to predict a target output variable (Picos-Benítez et al. 2020). It is structured into three layers, with nodes connected through their inputs and outputs, as shown in Fig. 2. The network consists of an input layer (current density, electrolysis time and initial MO concentration), one or more hidden layers, and an output layer (MO removal efficiency, current efficiency, and electrical energy consumption of MO and TOC); this arrangement is known as the multilayer perceptron (MLP) structure (Asgari et al. 2020). The number of neurons in the input and output layers corresponds to the number of input and response variables. In this study, a three-layer ANN model with one hidden layer was designed, in which the tangential sigmoid function was used at the hidden layer, whereas a linear transfer function was used at the output layer. The Levenberg–Marquardt backpropagation algorithm with 1000 epochs was employed for training the network. The number of neurons in the hidden layer was varied from 1 to 20, and the value giving the minimum mean squared error (MSE) was selected. The ANN analysis was performed using MATLAB R2023a. The performance of the ANN model can be statistically evaluated using the MSE and the coefficient of determination, in accordance with Eqs. (11) and (12), respectively (Khan et al. 2022; Özdoğan-Sarıkoç et al. 2023):

$$\mathrm{MSE}=\frac1{\mathrm{N}}{\textstyle\sum \nolimits_ {\mathrm i=1}^{\mathrm N}}\left(\left|{\mathrm{y}}_{\mathrm{pred},\mathrm{i}}-{\mathrm{y}}_{\mathrm{exp},\mathrm{i}}\right|\right)^2$$
(11)
$$\mathrm{R}^2=1-\frac{{\textstyle\sum \nolimits_ {\mathrm i=1}^{\mathrm N}}\left({\mathrm{y}}_{\mathrm{pred},\mathrm{i}}-{\mathrm{y}}_{\mathrm{exp},\mathrm{i}}\right)^2}{{\textstyle\sum \nolimits_ {\mathrm i=1}^{\mathrm N}}\left(\overline{{\mathrm{y}}_\mathrm{exp}}-{\mathrm{y}}_{\mathrm{exp},\mathrm{i}}\right)^2}$$
(12)

\({\mathrm{y}}_{\mathrm{pred},\mathrm{i}}\) and \({\mathrm{y}}_{\mathrm{exp},\mathrm{i}}\) denote predicted and experimental ith values in scalar unit such as dye or TOC removal efficiency, respectively. \(\overline{{\mathrm{y} }_{\mathrm{exp}}}\) represents an average experimental value of either dye or TOC removal efficiency. MSE and \({\mathrm{R}}^{2}\) are mean square error and coefficient of determination, respectively.
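The two metrics can be computed directly from paired predicted and experimental values. In the sketch below, R² is evaluated in its standard form, one minus the ratio of the residual sum of squares to the total sum of squares about the experimental mean. The removal efficiencies are hypothetical illustration values, not model outputs from this study.

```python
def mse(predicted, experimental):
    """Mean squared error between predicted and experimental values."""
    n = len(experimental)
    return sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n

def r_squared(predicted, experimental):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_exp = sum(experimental) / len(experimental)
    ss_res = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    ss_tot = sum((mean_exp - e) ** 2 for e in experimental)
    return 1.0 - ss_res / ss_tot

# Hypothetical removal efficiencies (%).
y_exp = [98.8, 90.0, 80.0, 66.0]
y_pred = [98.0, 91.0, 79.0, 67.0]

print(f"MSE = {mse(y_pred, y_exp):.3f}, R^2 = {r_squared(y_pred, y_exp):.4f}")
```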

Fig. 2

ANN network with topology. ANN operates like a human brain and nervous system. It possesses one or more hidden layers, input and output layers, which are known as multilayer perceptron (MLP) structures. The neurons in the input are feedforwarded through the hidden layers to the output layers, representing the response variables. The Levenberg–Marquardt backpropagation algorithm is adopted to train the network

2.6.2 SVM Procedure

The SVM method applies either a linear or non-linear mapping function to map the experimental data into a higher-dimensional feature space and searches for an optimum hyperplane in the new space to classify the samples (Ding et al. 2023). The SVM algorithm and its regression models have faster training times and are more advantageous than ANN models in finding globally optimal solutions for a given experimental dataset (Özdoğan-Sarıkoç et al. 2023). The support vector regression (SVR) algorithm can be derived from the SVM algorithm to predict response variables. However, given the limited predictability of the ANN algorithm, the radial basis of ANN function was still dominant compared to the SVM algorithm (Safeer et al. 2022). Moreover, SVM helps to identify patterns and/or classify a specific dataset. Comparing the predicted and experimental values provides information on the degree of fit. The primary goal of the SVM algorithm is to identify the hyperplane in an N-dimensional space that separates distinct classes of data points (Singh et al. 2023). A number of features define the hyperplane. However, as the number of features increases, the complexity of the model also increases, making it more challenging to comprehend. When combined with an ANN model, the interpretation of the complex model becomes more manageable. The following performance indicators of the SVM model are used in AI optimization (Khan et al. 2022):

$$\mathrm{MSE}=\frac1{\mathrm{n}}{\textstyle\sum \nolimits_ {\mathrm i=1}^{\mathrm n}}\left({\mathrm{y}}_{\mathrm{i},\mathrm{cal}}-{\mathrm{y}}_{\mathrm{i},\mathrm{exp}}\right)^2$$
(13)
$$\mathrm{RMSE}=\sqrt{\frac{\sum_{\mathrm{i}=1}^{\mathrm{n}}{\left({\mathrm{Z}}_{\mathrm{i}}-{\mathrm{Y}}_{\mathrm{i}}\right)}^{2}}{\mathrm{n}}}$$
(14)

\({\mathrm{y}}_{\mathrm{i},\mathrm{cal}}\) and \({\mathrm{y}}_{\mathrm{i},\mathrm{exp}}\) denote calculated and experimental ith values in scalar unit, such as dye or TOC removal efficiency, respectively. \({\mathrm{Z}}_{\mathrm{i}}\) and \({\mathrm{Y}}_{\mathrm{i}}\) denote predicted and experimental ith values, such as dye or TOC removal efficiency, and n is the number of data points.

SVM is a regression model that requires a decision boundary involving a maximum-margin hyperplane to solve a learning sample (Wang et al. 2022). To perform curve fitting, the conceptual relationship of SVM and Lagrange multiplier method involves regression analysis of the data. This relationship can be described using a functional equation of the regression as follows (Wang et al. 2022):

$$\mathrm{f}\left(\mathrm{x}\right)=\upomega \cdot\upphi \left(\mathrm{x}\right)+\mathrm{b}$$
(15)

where x represents the input vector; ω is the weight vector and b is the bias term; and ϕ(x) is the characteristic (feature-mapping) function. In addition, \(\upphi :\mathrm{X}\to\upphi \left(\mathrm{X}\right)\in {\mathrm{R}}^{\mathrm{H}}\) is any non-linear function that maps the input experimental data into a high-dimensional feature space (Rodriguez-Galiano et al. 2015).

The model optimisation was subject to the soft-margin constraint, in which the hyperplane separates the training data with the maximum margin. The optimization problem can be solved using the Lagrange multipliers method, in which the kernel function is defined as the inner product of the transformed input feature vectors (Rodriguez-Galiano et al. 2015):

$$\mathrm{K}\left({\mathrm{x}}_\mathrm{i},{\mathrm{x}}_\mathrm{j}\right)=\langle\upphi\left({\mathrm{x}}_\mathrm{i}\right)\vert\upphi\left({\mathrm{x}}_\mathrm{j}\right)\rangle$$
(16)
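The kernel identity above can be checked numerically for a case where the feature map is known explicitly. The sketch below uses the degree-2 polynomial kernel K(x, z) = (x · z)² in two dimensions, whose feature map is ϕ(x) = (x₁², √2·x₁x₂, x₂²), and also defines a radial basis function kernel for comparison. Both kernels are illustrative choices; the text does not state which kernel was used in this study.

```python
import math

def dot(a, b):
    """Euclidean inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def poly2_feature_map(x):
    """Explicit map phi for K(x, z) = (x . z)^2 with two input features."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """Kernel evaluated in input space, without building phi."""
    return dot(x, z) ** 2

def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel; gamma is a hypothetical setting."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = (1.0, 2.0), (3.0, 4.0)
in_feature_space = dot(poly2_feature_map(x), poly2_feature_map(z))
print(in_feature_space, poly2_kernel(x, z), rbf_kernel(x, z))
```

Both routes give the same number for the polynomial kernel, which is why the optimisation never needs to construct the high-dimensional vectors explicitly.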

2.6.3 Random Forest

Random forest is an ensemble of Classification and Regression Trees (CART), a machine learning-based approach with the potential to capture complex non-linear relationships between the selected variables (Wang et al. 2022). Random forest trains multiple decision trees, each consisting of nodes, and uses them to predict samples. The training data are randomly sampled with replacement, and a new decision tree is built on each resampled set, which increases the overall accuracy and stability of the predictions. For regression problems, the final prediction of the random forest is the mean of the predictions of the individual decision trees.
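The two ingredients described above, resampling the training data with replacement and averaging the trees' predictions, can be sketched with very small trees. The code below is a deliberate simplification: each tree is a single-split stump on one feature, so this is plain bagging, whereas a full random forest grows deeper trees and also samples candidate features at each split. The data are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Best single-threshold split on one feature by squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep the right side non-empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, ml, mr)
    if best is None:  # every x in this resample is identical
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, mean_left, mean_right = stump
    return mean_left if x <= threshold else mean_right

def fit_forest(xs, ys, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]  # sample with replacement
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Regression output: mean of the individual tree predictions."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)

# Hypothetical data: current density (mA/cm^2) vs removal efficiency (%).
xs = [15, 20, 25, 30, 35]
ys = [70.0, 78.0, 85.0, 91.0, 95.0]
forest = fit_forest(xs, ys)
print(predict_forest(forest, 15), predict_forest(forest, 35))
```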

2.7 Statistical Analysis and Data Fitting Using AI Models

2.7.1 Development of ANN Architecture

All operational parameters used in the ANN approach were adopted from the experimental data. The desired output responses were MO removal efficiency, electrical energy consumption of MO and TOC, and current efficiency of the electrochemical reactor. Firstly, it was assumed that artificial neurons are arranged in sequential layers. Secondly, the neurons within the same layer do not interact with one another. Thirdly, all input operating parameters entering the network architecture must pass from the input layer through the hidden layer to the output layer. All hidden layers must have a similar activation or transfer function. Once the output variables are generated, they are compared with the experimental values using statistical measures such as MSE, RMSE and R².

The proposed mathematical equation representing the ANN model can be written as follows (Asgari et al. 2020):

$${\mathrm Y}_{\mathrm n}={\mathrm f}_0\left\{{\mathrm a}_0+{\textstyle\sum \nolimits_ {\mathrm k=1}^{\mathrm h}}\left[{\mathrm w}_{\mathrm k}\times{\mathrm f}_{\mathrm h}\left({\mathrm a}_{\mathrm{hk}}+{\textstyle\sum \nolimits_ {\mathrm i=1}^{\mathrm m}}{\mathrm j}_{\mathrm{ik}}{\mathrm X}_{\mathrm{ni}}\right)\right]\right\}$$
(17)

where Yn represents the normalized response variable, f0 denotes the transfer function in the output layer, a0 is the bias value in the output layer, wk are the weights between the output and hidden layers, fh is the transfer function in the hidden layer (the tan-sigmoid function in this study), ahk is the bias value in the hidden layer, jik represents the weights between the hidden and input layers, and Xni denotes the normalized input variables, ranging between 0.1 and 0.9 for a specific study.
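A forward pass through Eq. 17 can be written out for a single output neuron. The sketch below uses tanh for the hidden layer, which is mathematically identical to the tan-sigmoid function, and an identity (linear) output transfer function as described in Sect. 2.6.1. The linear min-max scaling onto 0.1 to 0.9, the input ranges, and all weights and biases are hypothetical placeholders, not the trained values from this study.

```python
import math

def normalise(x, x_min, x_max, lo=0.1, hi=0.9):
    """Linear scaling of x onto [lo, hi] (assumed form of the normalisation)."""
    return lo + (hi - lo) * (x - x_min) / (x_max - x_min)

def ann_forward(x_norm, hidden_weights, hidden_biases, output_weights, output_bias):
    """Eq. 17 with f_h = tanh and f_0 = identity, for one response variable."""
    hidden = [
        math.tanh(a_hk + sum(j_ik * x for j_ik, x in zip(j_k, x_norm)))
        for j_k, a_hk in zip(hidden_weights, hidden_biases)
    ]
    return output_bias + sum(w_k * h for w_k, h in zip(output_weights, hidden))

# Inputs: current density, electrolysis time, initial MO concentration.
x_norm = [
    normalise(15.0, 15.0, 35.0),   # mA/cm^2
    normalise(30.0, 0.0, 30.0),    # min
    normalise(50.0, 50.0, 125.0),  # mg/L
]
# Hypothetical parameters for a network with two hidden neurons.
j = [[0.4, 1.2, -0.8], [-0.3, 0.5, 0.9]]
a_h = [0.1, -0.2]
w = [0.6, -0.4]
a_0 = 0.5

print(ann_forward(x_norm, j, a_h, w, a_0))
```

In a hidden-neuron sweep such as the 1 to 20 range used here, this forward pass would be evaluated for each candidate size and the size with the lowest MSE retained.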

2.7.2 Multiple Regression Analysis

Multiple regression analysis is one of the statistical techniques used to analyse the relationship between a single dependent variable and a range of independent variables. The primary purpose of using multiple regression analysis is to use independent variables to predict the value of a single dependent variable (Wagner et al. 2006). Each predictor is weighed, with total weights contributing to the overall prediction. The following represents the equation for describing the overall prediction (Wagner et al. 2006):

$$\mathrm Y=\mathrm a+{\mathrm b}_1{\mathrm X}_1+{\mathrm b}_2{\mathrm X}_2+\dots+{\mathrm b}_{\mathrm n}{\mathrm X}_{\mathrm n}$$
(18)

where Y denotes the dependent variable; a is the intercept; X1 to Xn represent the independent variables; and b1 to bn represent the weights chosen to maximise the prediction of the dependent variable from the set of independent variables.
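One common way to obtain the intercept and weights of Eq. 18 is ordinary least squares via the normal equations, sketched below with a small Gaussian elimination solver. The choice of ordinary least squares is an assumption, since the text does not specify the fitting procedure, and the data are synthetic, generated from Y = 100 + 0.5·X1 − 0.4·X2 so that the recovered coefficients can be checked.

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_multiple_regression(rows, y):
    """Least-squares coefficients [a, b1, ..., bn] for Eq. 18."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[k] for d in design) for k in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(coeffs, row):
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))

# Synthetic data: X1 = current density, X2 = initial concentration.
rows = [(15, 50), (15, 125), (35, 50), (35, 125), (25, 100)]
y = [100 + 0.5 * x1 - 0.4 * x2 for x1, x2 in rows]

coeffs = fit_multiple_regression(rows, y)
print(coeffs, predict(coeffs, (20, 75)))
```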

3 Results and Discussion

3.1 Effect of the Operational Parameters on the Electrochemical Process

Current density was one of the most influential parameters affecting the overall electrochemical treatment efficiency. The experiment studied the effect of a current density of 15 mA/cm² on the degradation efficiency of MO by the 3D electrochemical process. Fig. 3 shows that when a current density of 15 mA/cm² was applied for at least 30 min of electrolysis time to initial MO concentrations ranging from 50 to 125 mg/L, the MO removal rate constant decreased from 0.149 to 0.036 min⁻¹ while the MO removal efficiency decreased from 98.8% to 66.0%. For comparison, Liu et al. (2022) achieved removal efficiencies and rate constants of approximately 70% (0.046 min⁻¹) and 90% (0.241 min⁻¹) in 2D and 3D electrochemical treatment, respectively. The results indicate that the higher the initial MO concentration, the lower the MO removal efficiency and removal rate constant, due to the competitive reaction between •OH and dye pollutants. There were two types of oxidation reaction: 1) direct anodic oxidation of MO pollutants at the anode; 2) indirect oxidation of MO by powerful oxidants such as hydroxyl radicals and active chlorine species electrogenerated in the bulk solution and on the anode and particle electrode surfaces. In addition, Fig. 3 shows that the applied current density of 15 mA/cm² increased the regeneration efficiency of GIC particles beyond 100% after a few adsorption-regeneration cycles. This means that after 5 cycles, all the adsorbed MO was degraded, leaving the GIC with fully recovered active sites. In addition, the propagation of error was calculated to determine how uncertainty in the measured variables affects the calculated quantities. In this case, the propagated uncertainty for the regeneration efficiencies was approximately 15.3%. 
This value indicates that the effect of electrochemical regeneration on active site recovery on GIC particle electrodes was not significantly different in each cycle of adsorption-regeneration, and almost equal proportion or approximately 100% of active sites can be recovered after electrochemical treatment. Secondly, the result also indicated that the effect of electrochemical regeneration on the surface roughening of GIC particle electrodes was minimal as prolonged regeneration can affect the physicochemical properties of particle electrodes, offsetting the recovery of active sites for better regeneration and adsorption efficiencies.
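The reported rate constants and removal efficiencies can be cross-checked against the pseudo-first-order model of Eq. 1, under the assumption that the decay follows that model over the whole 30 min. The sketch below computes the removal efficiency implied by each reported k, together with the corresponding half-life; the implied removals come out at roughly 99% and 66%, close to the reported 98.8% and 66.0%.

```python
import math

def removal_efficiency_pct(k_per_min, time_min):
    """Removal implied by Eq. 1: 1 - Ct/C0 = 1 - exp(-k t), in percent."""
    return (1.0 - math.exp(-k_per_min * time_min)) * 100.0

def half_life_min(k_per_min):
    """Half-life of a first-order decay, ln(2) / k."""
    return math.log(2.0) / k_per_min

for k in (0.149, 0.036):  # rate constants reported in the text (1/min)
    removal = removal_efficiency_pct(k, 30.0)
    print(f"k = {k} 1/min: removal at 30 min = {removal:.1f}%, "
          f"t1/2 = {half_life_min(k):.1f} min")
```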

Fig. 3

Effect of a current density of 15 mA/cm² on the mineralisation rate and MO removal efficiency. The significance of this result is that a higher current density is required to completely mineralise large amounts of MO pollutants at higher concentrations. The higher half-life of MO pollutants indicates that not all dyes are completely mineralised, leaving them in the aqueous solution

This phenomenon was attributed to surface roughening, which led to changes in surface chemistry or physicochemical properties of GIC (Nkrumah-Amoako et al. 2014). Past researchers showed that the surface area of GIC was expanded during the electrochemical regeneration process (Nkrumah-Amoako et al. 2014). During the electrochemical oxidation process, MO pollutants were adsorbed onto the GIC particle electrodes and oxidized on its electroactive surface into intermediate byproducts. The electrogenerated hydroxyl radicals from the water-splitting process and active chlorine species formed during the electrolytic process helped degrade the MO pollutants through indirect oxidation. In contrast, the direct oxidation of MO pollutants occurred on the surface of anodic material (Martínez-Huitle and Ferro 2006). Hydroxyl radicals formed on the surface of the anodic material by physisorption were released into the bulk liquid media to degrade the MO pollutants.

Notwithstanding the effect of physisorption, Fig. 4 shows that the regeneration efficiency of GIC adsorbent also played a significant role in recovering the surface-active sites. The regeneration efficiency was influenced by surface roughening of the GIC particle electrodes, resulting in changes in physicochemical properties. Figure 5 shows that MO and TOC concentrations decreased significantly when the current density increased from 15 to 35 mA/cm2, including at an initial MO concentration of 50 mg/L. The competitive reaction of ·OH oxidants and active chlorine species with MO pollutants affected the amount of highly reactive oxidizing species available to mineralise the organic pollutants completely.

Fig. 4
figure 4

Effect of 15 mA/cm2 of current density on the regeneration efficiency of GIC particle electrodes. The significance of this result is that electrolysis leads to changes in GIC physicochemical properties, causing surface roughening and surface area recovery or availability for more adsorption due to high regeneration efficiency, thereby improving the uptake of MO pollutants

Fig. 5
figure 5

Effect of changing current densities and electrolysis time on different MO and TOC concentrations. The significance of this result is that the greater the current density, the greater the mineralisation efficiency, resulting in a significant decrease in MO and TOC concentrations over electrolysis time

The electrochemical oxidation mechanism of organic pollutants via highly reactive hydroxyl radicals at the anode is as follows:

$$\mathrm{M}\left(\mathrm{s}\right)+{\mathrm{H}}_{2}\mathrm{O}\left(\mathrm{l}\right)\to \mathrm{M}\left(\cdot \mathrm{OH}\right)+{\mathrm{H}}^{+}\left(\mathrm{aq}\right)+{\mathrm{e}}^{-}$$
(19)
$$\mathrm{M}\left(\cdot \mathrm{OH}\right)\to \mathrm{MO}\left(\mathrm{s}\right)+{\mathrm{H}}^{+}\left(\mathrm{aq}\right)+{\mathrm{e}}^{-}$$
(20)
$$\mathrm{MO}\left(\mathrm{s}\right)+\mathrm{P}\to \mathrm{M}\left(\mathrm{s}\right)+\mathrm{PO}$$
(21)

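where P denotes pollutants, PO denotes the oxidised pollutants and M denotes the metal oxide electrode. In Eqs. (20) and (21), MO(s) denotes the higher oxide formed on the electrode surface, not methyl orange.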

The hydroxyl radical is one of the most potent oxidants in aqueous solution, with E0 = 2.73 V/SHE, and can be electrogenerated on the surface of the electrode (Serrano 2021). A weak interaction between the radical and the electrode surface is desirable because it allows the radical to react with nearby pollutant species. Hydroxyl radicals can be either physisorbed or chemisorbed onto the electrode, depending on the strength of their interaction with the electrode surface. Physisorption mainly involves van der Waals forces, which are much weaker than a covalent bond. The radical species is highly reactive and short-lived, with a half-life of approximately 10 ns (Serrano 2021). If chemisorption predominates, it will hinder the mass transfer of hydroxyl radicals into the bulk solution, reducing the oxidation potential of the electrochemical system.

Active chlorine species are often present alongside hydroxyl radicals, especially in an electrochemical system that uses NaCl as the electrolyte. The H+ ions generated increase the acidity of the treated wastewater, which helps sustain hydroxyl radicals and active chlorine species. On the other hand, high current density exacerbates the side reactions, resulting in reduced treatment efficiency.

In addition, the cathodic half-reactions for water electrolysis and active chlorine species are as follows:

$$2{\mathrm{H}}_{2}\mathrm{O}\left(\mathrm{l}\right)+2{\mathrm{e}}^{-}\to 2{\mathrm{OH}}^{-}\left(\mathrm{aq}\right)+{\mathrm{H}}_{2}\left(\mathrm{g}\right)$$
(22)
$${\mathrm{OCl}}^{-}\left(\mathrm{aq}\right)+{\mathrm{H}}_{2}\mathrm{O}\left(\mathrm{l}\right)+2{\mathrm{e}}^{-}\to {\mathrm{Cl}}^{-}\left(\mathrm{aq}\right)+2{\mathrm{OH}}^{-}\left(\mathrm{aq}\right)$$
(23)

The hydroxide ions generated in Eqs. (22) and (23) increase the alkalinity of the catholyte solution, while Eq. (23) also regenerates chloride.

During the electrochemical process, assuming that the nitrogen and sulfur atoms in MO are converted into nitrate and sulfate, the complete electrochemical oxidation reaction of MO is as follows:

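$$\mathrm{C}_{14}\mathrm{H}_{14}\mathrm{N}_{3}\mathrm{NaO}_{3}\mathrm{S}(\mathrm{s})+38\mathrm{H}_{2}\mathrm{O}(\mathrm{l})\to 14\mathrm{CO}_{2}(\mathrm{g})+90\mathrm{H}^{+}(\mathrm{aq})+\mathrm{Na}^{+}(\mathrm{aq})+\mathrm{SO}_{4}^{2-}(\mathrm{aq})+3\mathrm{NO}_{3}^{-}(\mathrm{aq})+86\mathrm{e}^{-}$$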
(24)

For incomplete electrochemical oxidation reaction of MO due to the influence of side reactions, the equation is as follows:

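$$\mathrm{C}_{14}\mathrm{H}_{14}\mathrm{N}_{3}\mathrm{NaO}_{3}\mathrm{S}(\mathrm{s})+11\mathrm{H}_{2}\mathrm{O}(\mathrm{l})\to 14\mathrm{CO}(\mathrm{g})+36\mathrm{H}^{+}(\mathrm{aq})+\mathrm{Na}^{+}(\mathrm{aq})+\mathrm{S}^{2-}(\mathrm{aq})+\frac{3}{2}\mathrm{N}_{2}(\mathrm{g})+35\mathrm{e}^{-}$$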
(25)
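
Because the electron count in an oxidation half-reaction is fixed by element and charge balance, the stoichiometry of the complete and incomplete MO oxidation reactions can be checked mechanically. The following is a minimal sketch of such a check, not part of the original analysis; the helper names `totals` and `electrons_released` are illustrative, and each species is entered by hand as (coefficient, element counts, charge).

```python
from collections import Counter
from fractions import Fraction

MO = {"C": 14, "H": 14, "N": 3, "Na": 1, "O": 3, "S": 1}  # methyl orange
WATER = {"H": 2, "O": 1}

def totals(side):
    """Return (element totals, net ionic charge) for a list of species."""
    atoms, charge = Counter(), 0
    for coeff, formula, q in side:
        for element, count in formula.items():
            atoms[element] += coeff * count
        charge += coeff * q
    return atoms, charge

def electrons_released(reactants, products):
    """Verify element balance, then infer the electrons on the product side."""
    left_atoms, left_charge = totals(reactants)
    right_atoms, right_charge = totals(products)
    if left_atoms != right_atoms:
        raise ValueError("elements are not balanced")
    # Electrons carry charge -1, so their number equals the excess
    # positive ionic charge on the product side.
    return right_charge - left_charge

# Complete oxidation to CO2, nitrate and sulfate.
complete = electrons_released(
    [(1, MO, 0), (38, WATER, 0)],
    [(14, {"C": 1, "O": 2}, 0), (90, {"H": 1}, 1), (1, {"Na": 1}, 1),
     (1, {"S": 1, "O": 4}, -2), (3, {"N": 1, "O": 3}, -1)],
)
# Incomplete oxidation to CO, sulfide and N2.
incomplete = electrons_released(
    [(1, MO, 0), (11, WATER, 0)],
    [(14, {"C": 1, "O": 1}, 0), (36, {"H": 1}, 1), (1, {"Na": 1}, 1),
     (1, {"S": 1}, -2), (Fraction(3, 2), {"N": 2}, 0)],
)
print(complete, incomplete)
```

The value returned for each reaction is the number of electrons per mole of MO that enters any Faraday-based efficiency calculation.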

Judging from Eqs. (24) and (25), both complete and incomplete oxidation reactions influence the MO and TOC removal efficiencies. Figure 6 shows the differences between the effects of complete and incomplete oxidation reactions on the current efficiency of the 3D electrochemical process. Complete oxidation of MO by ·OH oxidants resulted in higher current efficiency than incomplete oxidation. This was attributed to more efficient utilization of current to generate powerful oxidants such as hydroxyl radicals and active chlorine species that degrade MO pollutants in aqueous solution. However, when the current density was increased from 15 to 35 mA/cm2, the current efficiency decreased significantly for all initial dye concentrations. This result indicated that side reactions produced a significant amount of intermediate transformation byproducts, which lowered the current efficiency.
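
The dependence of current efficiency on the assumed oxidation pathway can be illustrated with a Faraday's-law calculation. This is a minimal sketch under stated assumptions rather than the expression used in this study: current efficiency is taken as the charge theoretically needed to oxidise the removed MO divided by the charge actually passed, and the function name, argument names and units are hypothetical. The electron count per mole of MO is passed in explicitly so that either oxidation equation can be used.

```python
FARADAY = 96485.0   # Faraday constant, C/mol
M_MO = 327.33       # molar mass of methyl orange (C14H14N3NaO3S), g/mol

def current_efficiency(c0_mg_l, ct_mg_l, volume_l, current_a, time_s, n_electrons):
    """Percentage of the passed charge accounted for by MO oxidation.

    Assumes every removed MO molecule consumes n_electrons electrons.
    """
    removed_mol = (c0_mg_l - ct_mg_l) / 1000.0 * volume_l / M_MO
    charge_needed = n_electrons * FARADAY * removed_mol   # coulombs
    charge_passed = current_a * time_s                    # coulombs
    return 100.0 * charge_needed / charge_passed
```

For a fixed amount of MO removed, the computed efficiency scales linearly with the electron count and inversely with the applied current, which is consistent with the lower current efficiencies observed at higher current densities.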

Fig. 6
figure 6

Effect of different current densities on current efficiency of the 3D electrochemical reactor. The significance of this result is that not all of the applied current is utilised efficiently to mineralise the MO pollutants. Some current was lost to side reactions, the quenching effect of surrounding media and interference from intermediate transformation byproducts

In an electrochemical process, electrical energy consumption is a critically important parameter. The electrical conductivity of the MO solution and the GIC particle electrodes directly influenced the energy consumption of the 3D electrochemical reactor. Therefore, integrating the electrochemical reactor with electrically conductive GIC particle electrodes can decrease the solution's electrical and mass transfer resistances, leading to better voltage utilization at a fixed electric current. Ions such as nitrate, ammonium and sulphate provided electrical conductivity in the solution, and mineralisation of MO pollutants was accompanied by the evolution of \({\mathrm{NH}}_{4}^{+}\), \(\mathrm{NO}_3^-\) and \(\mathrm{SO}_{4}^{2-}\). In addition, the electrogenerated oxidant species may lead to corrosion of the electrodes, inadvertently increasing the electrical resistance. The increase in ohmic resistance of the electrode due to corrosion may result in additional maintenance and repair costs after prolonged electrochemical treatment. Moreover, the results from Fig. 6 showed that the current efficiency determined how effectively the current was utilised, directly influencing the amount of energy channelled into the degradation of dye contaminants. The differences between the complete and incomplete oxidation reactions were due to differences in the number of electrons transferred. Complete oxidation reactions were considered ideal reactions with more electrons yielded, as presented by Eq. (24). On the other hand, incomplete oxidation reactions involved some loss of electrons due to inefficient reactions and undesirable uptake of electrons due to the quenching effect of surrounding media. In addition, Fig. 7a shows that when the current density increased from 15 to 35 mA/cm2 for all initial MO concentrations, the electrical energy consumption increased from 5 kWh/kg MO to greater than 30 kWh/kg MO.
On the other hand, Fig. 7b shows that the electrical energy consumption for TOC removal increased more significantly than that for the electrooxidation of MO pollutants because more electrical energy is required to achieve complete mineralisation. In addition, the electrical energy consumption per unit of TOC was more critical and more reflective of the actual breakdown of dye contaminants into CO2 and H2O. Complete mineralisation prevents the formation of aromatic amines, which could be more ecotoxic than the parent compound.
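
The units kWh/kg MO and kWh/kg TOC can be made concrete with a short calculation. The sketch below assumes the common definition of specific energy consumption (cell voltage × current × time divided by the mass of pollutant removed); the exact expression used in this study is not restated here, and the function and argument names are hypothetical.

```python
def energy_per_kg(voltage_v, current_a, time_h, c0_mg_l, ct_mg_l, volume_l):
    """Specific electrical energy consumption in kWh per kg of pollutant removed.

    c0_mg_l and ct_mg_l may be MO concentrations (giving kWh/kg MO) or
    TOC concentrations (giving kWh/kg TOC).
    """
    energy_kwh = voltage_v * current_a * time_h / 1000.0   # W*h -> kWh
    removed_kg = (c0_mg_l - ct_mg_l) * volume_l / 1e6      # mg -> kg
    if removed_kg <= 0:
        raise ValueError("no pollutant removed")
    return energy_kwh / removed_kg
```

Under this definition, the same electrical energy divided by a smaller removed mass gives a larger value, so slower TOC removal relative to MO decolourisation translates directly into a higher kWh/kg TOC figure.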

Fig. 7
figure 7

a) Effect of different current densities and initial MO concentrations on electrical energy consumption (kWh/kg MO) of 3D electrochemical reactor; b) Effect of different current densities and initial MO concentrations on electrical energy consumption (kWh/kg TOC) of 3D electrochemical reactor. This result shows that higher electrical energy is required to completely mineralise the MO pollutants compared to the lower electrical energy needed to break down or convert the MO pollutants into intermediate transformation byproducts through different oxidation pathways. Side reactions may influence the amount of electrical energy consumption

3.1.1 The Prediction Efficiency of Multi-regression Analysis, ANN and SVM models

To assess the prediction efficiency of the ANN model relative to multiple regression analysis and the SVM model, 14 experimental runs were conducted for each set of current density, initial MO concentration and electrolysis time. The ANN-predicted removal efficiencies were remarkably close to the experimental values, as shown in Fig. 8a-d. Similarly, the predictions for electrical energy consumption of MO and TOC and for current efficiency showed high R2 values between the experimental and predicted values, highlighting the robustness of the ANN in providing accurate predictions. Figures 8a-d show the training, validation and testing subsets; all data were randomly partitioned and imported into the ANN model. The MSE depended on the number of neurons applied in the hidden layer. The statistical analyses were based on the parameterized hypotheses between the experimental and AI-generated data. The MSE of the trained network was 22.44, with R2 = 0.992, as shown in Fig. 8a and e. R2 measures the degree of fit between the experimental and predicted responses. The R2 values obtained for training, validation and testing were 0.992, 0.965 and 0.845, respectively, as shown in Fig. 8a. An R2 value close to 1 indicates a satisfactory relationship between outputs and target values. Linear fits of the ANN outputs against the targets are given in Fig. 8a-d; for the validation subset in Fig. 8a, the fitted line was output = 0.72 × target + 15. Figures 8a-d show a good correlation between the experimental and predicted results obtained using the training function.
Furthermore, the ANN topology was examined by varying the number of layers and the number of neurons in the hidden layer to find an optimal configuration. In other words, the number of hidden layers was determined by trial and error. Statistical metrics were used as evaluation criteria to identify the configuration with minimal deviation between the experimental and predicted response variables. The prediction capability of the ANN did not increase with the number of neurons in the hidden layer because of overfitting, which led to increased error and variability. In addition, the prediction results for MO removal efficiency showed R2 values for training, validation, testing and all data of 0.992, 0.965, 0.845 and 0.909, respectively (Fig. 8a-d). These results indicated remarkable agreement between the experimental and ANN-predicted results. The Levenberg–Marquardt backpropagation algorithm (LMBPA) was used to train the network. The performance plot of the trained network is shown in Fig. 8e-h; in Fig. 8f, training stopped at 0.0713 at epoch 100, which was close to the acceptable range. In general, for function estimation with fewer than 100 network parameters, the Levenberg–Marquardt algorithm tends to show higher efficiency and speed of calculation, and high accuracy is retained in the majority of cases. The benefit of using this algorithm is its minimal error. During training, the outputs predicted by the model compared well with the expected values, as can be observed from the calculated MSE values.
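
The two statistics used throughout this comparison can be computed directly from paired experimental and predicted values. The following is a minimal sketch; it assumes that R2 denotes the coefficient of determination (rather than the squared Pearson correlation that some neural network toolboxes report), and the function names are illustrative.

```python
def mse(actual, predicted):
    """Mean squared error between experimental and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Evaluating both functions separately on the training, validation and testing subsets gives one pair of values per subset; a marked drop in R2 from training to testing is the usual sign of overfitting.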

Fig. 8
figure 8

The performance of ANN models with topology for training, validation, test and all data for a) MO removal efficiency; b) electrical energy consumption (kWh/kg MO); c) electrical energy consumption (kWh/kg TOC); d) current efficiency; e) mean square error of validation performance for MO removal efficiency; f) mean square error of validation performance for electrical energy consumption (kWh/kg MO); g) mean square error of validation performance for electrical energy consumption (kWh/kg TOC); h) mean square error of validation performance for current efficiency; i) SVM prediction efficiency between the experimental and predicted data for MO removal efficiency; j) SVM prediction efficiency between the experimental and predicted data for electrical energy consumption of MO; k) SVM prediction efficiency between the experimental and predicted data for electrical energy consumption of TOC; l) ANN prediction efficiency between the experimental and predicted data for MO removal efficiency; m) RF prediction efficiency between the experimental and predicted data for MO removal efficiency; n) ANOVA analysis prediction efficiency between the experimental and predicted data for MO removal efficiency. The significance of the result is that the ANN algorithm yields robust predictions of response variables such as MO and TOC removal efficiencies, helping to extract patterns from imprecise or complicated data so that the operating parameters can be controlled to minimise system perturbations or errors

During the first phase, the training error decreased until the network approached a minimal error; with further training, the error increased again. The network training was halted at this stage, and the weights were returned to those at the minimum error. In addition, Fig. 8e to h show the error distribution (MSE) of MO removal efficiency, electrical energy consumption of MO and TOC, and current efficiency predicted by the ANN model. The low MSE values, coupled with high R2 values, indicated good agreement between the measured and predicted results. Although the linear relationships between the experimental and predicted results showed a good fit, they provided limited information on the model prediction efficiency due to the absence of non-linear multiple regression analysis. Furthermore, Fig. 8f and h show that the MSE values were the lowest among the statistical metrics for the number of neurons contained within the hidden layer. The relationship between the experimental and predicted data can be evaluated using the correlation coefficient. On the other hand, Fig. 8i-k show close agreement between the experimental and predicted values, indicating that the SVM algorithm can be used to strengthen the optimization power of the ANN model. In other words, the SVM model yielded one of the best fits between the experimental and predicted values.
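
The stopping rule described above (halt once the error starts rising, then restore the weights from the minimum) can be sketched as follows. This is an illustration only: it operates on a pre-recorded list of validation errors, and the `patience` parameter (the number of consecutive non-improving epochs tolerated) is a hypothetical setting, not a value reported in this study.

```python
def early_stopping_epoch(val_errors, patience=6):
    """Return (best_epoch, stop_epoch) for a sequence of validation errors.

    best_epoch is where the weights would be restored from;
    stop_epoch is where training would be halted.
    """
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    # The error never rose for `patience` epochs: training ran to the end.
    return best_epoch, len(val_errors) - 1
```

In a real training loop the same bookkeeping would be done on the fly, with a copy of the network weights saved whenever a new minimum is reached.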

The nature of surrounding media and quenching effect of ions on hydroxyl radicals and active chlorine species within the bulk solution influenced the synergistic adsorption and electrochemical oxidation of MO pollutants in wastewater. Some minor offset between the response variables predicted by the ANN models and the experimental data stemmed from side reactions and interference from the intermediate transformation byproducts of different oxidation pathways, resulting in slightly reduced correlation coefficients. In addition, the electrolytic effect of anode and cathode on the surface physicochemical properties of the GIC adsorbent can impact the adsorption capacity, increasing the surface area availability for further uptake of MO pollutants from the bulk solution. The surface functionalisation of GIC adsorbent also played a critical role in imparting electrostatic attraction between the MO pollutants and adsorbents. Moreover, the normalisation of experimental and ANN-predicted data shown in Fig. 8a-d indicated that the trained network was applied throughout the dataset, so the results were not open to misleading interpretation. The minor deviation between the experimental and ANN-predicted data was partly due to experimental variability. Still, the entire experimental dataset yielded a high correlation coefficient with small MSE values, indicating that the ANN optimisation technique was efficient.

Table 1 lists the MSE/RMSE and R2 obtained from Fig. 8 and compares them with values from the literature. It can be observed that MSE/RMSE and R2 values for the response variable MO removal efficiency were approximately similar to the best AI optimization results achieved by other researchers, albeit experimenting on different pollutants. This result shows that AI optimization techniques can be applied to the electrochemical treatment of dye wastewater, which was not previously shown. More interestingly, the ANN-optimised results were similar to the values in Table 1, especially when compared with the conventional wastewater treatment plants. Minor differences were attributed to the side reactions and buildup of intermediate transformation byproducts from different oxidation pathways, affecting the pollutant removal efficiency. The discrepancies in results are attributed to the type of pollutant under treatment, unit operations, and other laboratory parameters.

Table 1 Comparison of ANN versus SVM models using different wastewater treatment technologies

To compare and validate the ANN, the random forest optimization technique using the CART algorithm was also evaluated in this work. Figure 9 presents the general process of random forest by the CART algorithm (Wang et al. 2022).

Fig. 9
figure 9

Random forest is an ensemble learning method for classification and regression that operates by constructing a multitude of decision trees at training time. The step-by-step procedure used to construct the decision trees is shown

Figure 9 shows the random forest process and computational procedure for generating the regression trees or optimal tree diagrams. Figure SM1a shows that the optimal tree diagrams using random forest can be used to analyse the energy efficiency of three-dimensional electrochemical process by finding the optimum current density for electrolysing the MO textile wastewater. The optimal tree diagram in Figure SM1a in Supplementary File shows that when the current density dropped below 20 mA/cm2, the predictive analytics showed terminal node 1 with percentages around 7.1% for a range of calculations for electrical energy consumption of MO. The optimization results indicated that any current density below 20 mA/cm2 can achieve better energy efficiency than higher current density. When the current density was between 20 and 30 mA/cm2 at terminal node 2, the prediction results indicated that the electrical energy consumption of MO was higher than terminal node 1, indicating lower energy efficiency when the current density increased beyond 20 mA/cm2. However, when the current density increased beyond 30 mA/cm2, the energy efficiency decreased more significantly. The patterns of electrical energy consumption of TOC for Figure SM1b in Supplementary File were similar to Figure SM1a except that the amounts of energy consumption of TOC were higher than the typical energy consumption of MO due to greater electrical energy required to mineralize the dye contaminants in aqueous solutions. The efficiency analysis tree approach can optimize or monitor the energy usage in the electrochemical process within the WWTPs to substantially benefit people and the environment, reducing operational costs and greenhouse gas emissions significantly (Maziotis and Molinos-Senante 2023; Maziotis et al. 2023).
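
The way a regression tree arrives at thresholds such as 20 and 30 mA/cm2 can be illustrated with the split search at the core of CART. The sketch below handles a single predictor only and finds the threshold that minimises the summed squared error of the two resulting groups; it is not the implementation used in this work, and any data passed to it for illustration are hypothetical.

```python
def best_split(x, y):
    """Return (threshold, sse) for the binary split on x that minimises
    the summed squared error of y within the two resulting groups."""
    def sse(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    pairs = sorted(zip(x, y))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical predictor values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [v for u, v in pairs if u < threshold]
        right = [v for u, v in pairs if u >= threshold]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best
```

A full tree applies this search recursively to each group and over every predictor, and a random forest averages many such trees grown on bootstrap samples of the data.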

Alongside the ANN and SVM models, the multiple regression analyses in Fig. 10a-d showed variations in the fit between experimental and predicted values. The multi-regression analysis results presented in Fig. 10b show a small residual error between the experimental and predicted values, indicating only a slight variation when determining the current efficiency required to facilitate the oxidation reactions. The benefit of optimisation using multi-regression analysis is greater controllability over the process parameters while maintaining the energy efficiency of the oxidation reactions. The results in Fig. 10c and d show very small residuals between the experimental and predicted values, indicating a good fit with R2 values greater than 0.95. However, both ANN and SVM models yielded high R2 values, highlighting their superior optimisation power over multiple regression functions. In addition, a two-layer feed-forward network with sigmoid hidden neurons and linear output neurons can fit multidimensional mapping problems. An MLP network (3:4:4) was trained with the Levenberg–Marquardt backpropagation algorithm (LMBPA). Training stopped automatically when the MSE of the validation samples stopped improving, which improves generalisation. The network was trained for 4 replications to find the best number of neurons for the hidden layer.
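
The forward pass of such a two-layer network is short enough to write out. The sketch below assumes that (3:4:4) denotes 3 inputs, 4 hidden neurons and 4 outputs, and it uses the logistic function as the sigmoid; both are assumptions, since the hyperbolic tangent is an equally common sigmoid choice. The weights are supplied by the caller and would normally come from training.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: sigmoid hidden layer followed by a linear output layer.

    x        : list of inputs (e.g. current density, initial MO, time)
    w_hidden : one weight row per hidden neuron, each as long as x
    w_out    : one weight row per output, each as long as the hidden layer
    """
    hidden = [
        1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
        for row, b in zip(w_hidden, b_hidden)
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
```

With this layout the network has 3 × 4 + 4 hidden-layer parameters and 4 × 4 + 4 output-layer parameters, 36 in total.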

Fig. 10
figure 10

a) Multiple regression analysis for optimisation of MO removal efficiency at 50 mg/L of initial MO concentration; b) Multiple regression analysis for optimisation of current efficiency using current densities ranging from 15 to 35 mA/cm2; c) Multiple regression analysis for optimisation of electrical energy consumption of MO using current densities ranging from 15 to 35 mA/cm2; d) Multiple regression analysis for optimisation of electrical energy consumption of TOC using current densities ranging from 15 to 35 mA/cm2. The significance of using multiple regression analysis is to analyse the relationship between dependent variables and several independent variables to predict the outcome of the dependent values. In this case, multiple regression is used to compare the effects of initial MO concentrations and current densities on MO removal efficiency and electrical energy consumption of MO and TOC, respectively. The predictive outcomes of multiple regression analyses can be compared with the prediction efficiency of AI and machine learning techniques for validation

Finally, the multi-regression analysis in Fig. 10c and d shows minor residual errors or variations between the predicted and experimental values, indicating that the optimisation technique can achieve more robust process conditions by adjusting parameters. However, the ANN optimisation method provided the best prediction result over control of process parameters compared to multi-regression analysis. In addition, Table 2 summarises model validation by ANOVA analysis to compare differences between AI/ML optimisation techniques and multiple regression fit to actual versus predicted values for examining the three-dimensional electrochemical treatment of 50 mg/L MO using a current density of 15 mA/cm2.
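
For reference, the residuals discussed for the multiple regression fit can be obtained with ordinary least squares. The following is a minimal sketch that solves the normal equations by Gauss-Jordan elimination in plain Python; it assumes a purely linear model with an intercept, and the function names are illustrative rather than those of the software used in this study.

```python
def fit_linear(rows, y):
    """Least-squares coefficients [b0, b1, ...] for y ~ b0 + b1*x1 + ..."""
    X = [[1.0] + list(r) for r in rows]          # prepend intercept column
    k = len(X[0])
    # Augmented normal equations: (X^T X | X^T y).
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def residuals(rows, y, coeffs):
    """Experimental minus predicted value for each run."""
    return [
        yi - (coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], r)))
        for r, yi in zip(rows, y)
    ]
```

Each row would hold the operating parameters of one run (for example current density and initial MO concentration), and small residuals correspond to the close agreement between actual and predicted values described above.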

Table 2 Model validation by comparing AI/ML optimisation and ANOVA analysis through multiple linear regression fit to actual versus predicted data

3.1.2 Monte Carlo Simulations

The uncertainties associated with different ML predicted models were estimated using Monte Carlo simulations. The uncertainty and variability of input parameters influence the estimation of uncertainty. Compared to actual data, the predicted variables have an inherent uncertainty in estimating response variables. The Monte Carlo simulation was based on the repeated random sampling (n = 1,000 simulated samples) of the probability distributions defined for principal response variables of certain variation and uncertainty of each input parameter. The Monte Carlo approach allows the approximate estimation of variation and uncertainty stemming from system perturbation associated with specific input parameters and incorporating them into the estimates of response variables. In addition, simulations with 1,000 iterations were used to construct the distributions to calculate the level of uncertainties in different predictive model platforms. The simulated parameters can be extended beyond the current number of operating parameters. Uncertainty analyses in wastewater treatment systems compare the reliability of results, which is subject to variability that leads to significant imprecision in the predictive model platforms. The quality of wastewater treatment standards is based on rigorous regulation of water quality criteria to monitor the risk of adverse effects on the receiving bodies. This research aims to apply Monte Carlo simulation to assess the probability of adverse effects of xenobiotic dye wastewater in meeting environmental standards for effluent discharge. The achievable limits for textile dyeing effluent standards can be evaluated based on the simulated models to adhere to water quality standards.
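
The sampling procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the simulation code used in this study: every input is drawn from a normal distribution specified by a (mean, standard deviation) pair, `model` stands for any trained predictor that accepts the inputs as keyword arguments, and the function name and the fixed seed are hypothetical choices.

```python
import random
import statistics

def monte_carlo(model, input_dists, n=1000, seed=0):
    """Propagate input uncertainty through `model` by random sampling.

    input_dists maps each input name to (mean, standard deviation).
    Returns summary statistics of the n simulated responses.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n):
        sample = {name: rng.gauss(mu, sigma)
                  for name, (mu, sigma) in input_dists.items()}
        outputs.append(model(**sample))
    outputs.sort()
    return {
        "mean": statistics.mean(outputs),
        "stdev": statistics.stdev(outputs),
        "p05": outputs[n * 5 // 100],        # approximate 5th percentile
        "p95": outputs[n * 95 // 100 - 1],   # approximate 95th percentile
    }
```

The spread between the 5th and 95th percentiles gives a simple measure of how much uncertainty in the operating parameters carries over into the predicted response.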

Appropriate selection of input distributions to estimate the uncertainty and variability between the actual and predicted models from ML optimisation facilitates probabilistic analysis of the optimized results. A distribution is chosen based on how well it represents the dataset from the actual experimental results. The distribution can be empirical or take a parametric form such as normal, log-normal, uniform or triangular. All parameters in this study were assumed to follow normal or log-normal distributions. When a large number of independent random variables influence the dataset, the result tends towards a normal distribution, as shown in Figure SM2 in Supplementary File. The theoretical criterion for selecting a normal distribution is the central limit theorem (CLT). In addition, Figure SM2 shows the probability density distributions of the actual and optimized models. The uniformity of the probability distributions and the lack of skewness or heavy tails highlight the prediction efficiency of the AI/ML optimized models with limited uncertainty or variability. However, the probability distribution for different artificial intelligence and machine learning-based models showed that the higher the efficiency of targeted responses, the greater the uncertainty, which impacted the accuracy and precision of predictive models. The random forest algorithm generated greater uncertainty than the artificial neural network and support vector machine algorithms, indicating greater instability of system perturbation predicted by random forest. The simulated normal distribution seen in all artificial intelligence and machine learning-based models showed that most of the targeted responses achieved efficiencies within the 90% to 100% range.
The strong correlation between the current density and targeted responses based on the sensitivity analysis indicated that current density had the most significant effect on the pollutant removal efficiency. When the predicted models were combined, especially ANN and SVM, the level of uncertainty or system perturbation increased, leading to greater variability of the combined predictive models. In contrast, when the ANN and RF models were integrated into a single predictive model platform, the level of uncertainty in prediction efficiency was lower than that of the ANN-SVM model.
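
Skewness and heavy-tailedness of a simulated distribution can be quantified rather than judged by eye. The sketch below computes the moment-based sample skewness and excess kurtosis; both are close to zero for a normal distribution. The function name is illustrative, and this calculation is offered as a possible check rather than one reported in this study.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # variance (population form)
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0
```

Applied to the simulated responses of each model, a positive skewness would indicate a longer upper tail and a positive excess kurtosis heavier tails than a normal distribution.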

4 Conclusions

The results above confirmed that 3D electrochemical treatment integrated with graphite intercalation compound (GIC) particle electrodes and anodic oxidation technology is a very efficient technique to degrade methyl orange (MO) pollutants in simulated wastewater, achieving greater than 98% removal rate within 30 min of electrolysis time. The GIC particle electrodes in the 3D electrochemical process act as an electrocatalytic adsorbent material to effectively improve mineralisation efficiency and generation of ·OH oxidants, demonstrating the effectiveness of combined adsorption and electrochemical oxidation. However, the strength of electrolysis in this experiment was limited by the type of electrocatalytic material used, and the acidified salt concentration was also limited, resulting in slightly reduced electrical conductivity of solution, and less active chlorine species available to degrade MO pollutants. Nonetheless, the research results justify the application potential of green and efficient 3D electrochemical treatment of complex industrial wastewater. The synergistic effect of the 3D electrochemical process resulted in high MO removal and current efficiencies, reducing overall electrical energy consumption. In addition, GIC particle electrodes consistently maintained high regeneration efficiency beyond 100% throughout several consecutive cycles of adsorption and regeneration, highlighting the potential for reusability of particle electrodes. The AI optimisation power of multi-regression analysis, ANN, SVM and random forest ranked in the following order: ANN > RF > SVM > multiple regression analysis. The probabilistic distributions and scatterplots from Monte Carlo simulations indicated limited uncertainty and variability between actual and optimised models, highlighting the prediction efficiency of AI/ML optimisation approaches that are potentially applicable to water resources engineering and wastewater remediation in WWTP.
Most interestingly, the overall critical findings of the research showed that RF is intrinsically suited for analysing multiclass problems, while SVM is intrinsically suited for two-class problems. In this research, the predictive performance of RF versus SVM was approximately comparable due to almost equal uncertainties. In contrast, the ANN algorithm yielded significantly better prediction efficiency than the other two algorithms with fewer uncertainties. Although RF is considered robust to overfitting and excellent in handling extensive nonlinear data, SVM can operate effectively in high-dimensional spaces and is versatile in handling multiple data types. The predictive performance of these algorithms is primarily influenced by the sample size, the complex nature of the dataset and the type of problem being addressed. Subsequent studies should focus on evaluating other equally robust classifiers for optimising the electricity costs from industrial operation and greenhouse gas emissions of WWTP to identify the potential gap between pollutant generation and discharge sources to improve the efficacy and broaden the applicability of optimised models.