Introduction

According to the Food and Agriculture Organization (FAO), freshwater is one of the most significant commodities required for the sustenance of human life. Meanwhile, the presence of organic and inorganic pollutants resulting from the unrestricted discharge of effluents from various industries (desalination, oil & gas, abattoir, etc.) significantly degrades water quality, making it unsuitable for direct use as potable water or in industrial applications (Panagopoulos 2021). These issues have worsened the problem of freshwater scarcity over the last two decades (Panagopoulos 2022; Hashem et al. 2021a).

Abattoir industries are very lucrative, especially in developing countries like Nigeria, as they provide a major source of protein and also account for about 21% of the country’s agricultural GDP (Ogbeide 2015). Despite the nourishment and the considerable economic benefits derived from the abattoir industries, the generation of a considerable volume of contaminated wastewater (abattoir wastewater, AWW) during their operations remains a problem. Studies have shown that AWW is characterized by a variety of organic pollutants, which serve as substrates for ammonia–nitrogen (A-N) generating microorganisms (Elemile et al. 2019; Lin et al. 2014; Lopes et al. 2022). Due to the high toxicity of A-N, which has been extensively reported (Elemile et al. 2019; Haseena et al. 2016; Halim et al. 2010), its total elimination or considerable reduction in AWW is desirable.

Many techniques such as chemical precipitation, nitrification and denitrification, electrochemical coagulation, and membrane distillation (Haseena et al. 2016; Halim et al. 2010; Chen et al. 2018; Jorgensen et al. 2003; Mao et al. 2018; Qiang et al. 2020) have been successfully adopted for the removal of A-N from different sources. The reliability of adsorption techniques for A-N removal from AWW is attributed to their operational flexibility, cost-effectiveness, fast kinetics, and insensitivity to ammonia toxicity, as highlighted by several authors (Mao et al. 2018; Qiang et al. 2020; Ren et al. 2021; Mirahsani et al. 2019). Different adsorbents such as cross-linked chitosan, activated carbon and sawdust have been applied for AWW treatment (Atangana and Chiweshe 2019; Agarry and Owabor 2012; Djonga et al. 2019). However, this study evaluated the effectiveness of a crab shell-based adsorbent for the chelation of A-N from AWW.

Composed mainly of calcium carbonate, chitin, and proteins (Jeon et al. 2019; Ohale et al. 2020), crab shell biosorbents owe their properties to their rigid structure, high mechanical strength, and ability to withstand the extreme conditions employed during the synthesis and adsorption processes (Jeon et al. 2019). Jeon et al. (2019) and Jeon (2015) studied the removal of chromium and silver ions from an aqueous solution using amino-functionalized crab shell and immobilized crab shell beads. Using crab shell particles, Vijayaraghavan et al. (2011) investigated the removal of Mn (II) and Zn (II) from aqueous solutions. Also, iron-functionalized activated carbon was successfully applied for the uptake of aqueous silver (Wang et al. 2019). Currently, the removal of A-N has focused on its adsorption from municipal wastewater sources (Qiang et al. 2020; Cheng et al. 2019). However, to the best of our knowledge, no work has been published on the adsorptive removal of A-N from AWW using crab shells, thus making this work imperative.

In the past, a univariate, one-factor-at-a-time (OFAT) procedure was commonly adopted for evaluating the adsorptive removal of A-N (Hodur et al. 2020; Wang et al. 2018; Arslan and Veli 2011). However, the OFAT approach is usually cumbersome and time-consuming, and it rarely locates the desired optimum. These limitations can be circumvented using multivariate empirical techniques, which examine the effect of simultaneous variations of process factors on the response (Onu et al. 2020). Many nonlinear analytical techniques have previously been applied for nonlinear system modeling (Betiku et al. 2018, 2016), but response surface methodology (RSM), artificial neural network (ANN), and adaptive neuro-fuzzy inference system (ANFIS) have consistently shown good modeling and simulation results (Chen et al. 2015; Myers et al. 2009). RSM is a flexible mathematical tool and ANN is an artificial intelligence (AI) algorithm; both are used for the design, modeling and optimization of predominantly nonlinear systems (Chen et al. 2015; Ohale et al. 2017). The literature documents researchers’ strong preference for ANN over RSM in data modeling (Wang et al. 2018; Tu et al. 2019). This preference may be connected with the prediction accuracy of ANN and its minimal modeling data requirement (so long as the data are statistically well distributed in the input domain) (Mahanty et al. 2020; Hariram et al. 2019; Dehghani et al. 2019; Dehghani et al. 2020).

Comparative assessments of ANN and RSM (Onu et al. 2020; Betiku et al. 2018; Ohale et al. 2017), as well as of ANN and ANFIS (Onu et al. 2021a; Dastorani et al. 2010; Kiran and Rajput 2011), in process modeling have been reported by several authors. While some authors have verified the dominance of both ANN and ANFIS over RSM (Onu et al. 2020, 2021a, 2021b), Taheri et al. (2013) and Sajjadi et al. (2016) reported that RSM performed better than ANFIS. Furthermore, Kiran and Rajput (2011) observed that ANN performed better than ANFIS, even as Dastorani et al. (2010) confirmed the superiority of ANFIS over ANN. Betiku et al. (2016) established that both ANFIS and ANN were better than RSM, whereas ANN was slightly superior to ANFIS in data prediction accuracy. These inconsistent reports underscore the need to compare the predictive fitness of RSM, ANN and ANFIS under identical conditions, particularly as it relates to A–N uptake from AWW. Despite the numerous comparative applications of RSM, ANN and ANFIS in the modeling of different processes (Zarghi et al. 2020; Ting et al. 2020; Arabameri et al. 2015), no article exists on their application to the removal of A–N. This gap forms a major motivation for the present work.

Therefore, the study aims at synthesizing a novel iron-functionalized crab shell-based adsorbent with high A-N uptake capacity via thermal and chemical (impregnation) activation methods. Iron was chosen as a modifying agent owing to its special properties, namely catalysis, reduction, and increased electron transfer efficiency, which can improve the properties of CS with a larger surface area (Qin et al. 2020; Lyu et al. 2019). The pre- and post-adsorption characteristics of the synthesized novel adsorbent obtained, as well as the synergistic effects of process variables, were evaluated using RSM, ANN and ANFIS. Using genetic algorithm and RSM, the adsorptive system was further optimized, while the probable rate-controlling step and thermodynamics considerations during adsorptive uptake were elucidated.

Experimental

Materials

Crab shell was obtained from a waste disposal site in Badagry, Lagos State, Nigeria. The shells were washed and dried at a temperature of 378 K. With the aid of a mill, the dried shells were powdered, sieved to pass through a 0.2 mm mesh and subsequently stored in an airtight container for further processing.

Abattoir wastewater (AWW) was obtained from a local slaughterhouse in Amasea, Anambra state, Nigeria. AWW was characterized following standard procedures for water and wastewater analysis (APHA 2005; AWWA 2005), and the results are presented in Table 4. The characterized AWW was filtered to remove all solid particles capable of clogging the surface of the adsorbent and subsequently preserved in a dark-amber colored container for further analysis.

Analytical grade reagents of sulfuric acid (H2SO4), sodium hydroxide (NaOH), and iron nitrate (Fe(NO3)3) were purchased from Parchem limited, New Rochelle, New York, USA.

Adsorbent preparations

Fifty grams of ground crab shell was contacted with 400 ml of 0.3 M Fe(NO3)3 at 318 K for 7 h. The product was thoroughly washed with deionized water to remove all residual traces of the chemical. After drying, the chemically active crab shell was calcined for 3 h in a muffle furnace at 593 K, and the resultant product (iron-functionalized crab shell, Fe–CS) was cooled and stored for further use.

Instrumental characterization

Physicochemical properties of crab shell (CS), Fe–CS, and A–N-loaded Fe–CS were investigated using instrumental characterization. The functional groups, surface topography and crystalline structure of the samples were determined via Fourier transform infrared spectroscopy (FTIR – Thermo Nicolet Nexus, Model 470/670/870), scanning electron microscopy (SEM – Model Zeiss Evo MA – 17 EDX/WDS microscopy), and X-ray diffraction (XRD – Philips XPERT X-RAY diffraction unit), respectively. Additionally, the thermal strength was analyzed by thermo-gravimetric analysis (TGA – Mettler Toledo TGA/SDTG 851). FTIR, XRD, SEM, and TGA analyses were carried out following ASTM E1421–99, ASTM F1185–88, ASTM E2809, and ASTM D3418 standard procedures, respectively.

Adsorption studies

The A–N removal efficiency was investigated at varying pH (4, 5, 7, 9, 10), Fe–CS dosages (0.7, 1.0, 1.6, 2.2, 2.5 g), initial A–N concentrations (1.7, 16.0, 44.5, 73.0, 87.5 mg/L), temperatures (305.5, 308.0, 313.0, 318.0, 320.5 K), and contact times (30, 60, 120, 180, 210 min). The experimental template (see Table S1) comprises 32 individual runs, each with a unique combination of input parameters. The solution pH was adjusted with drops of 0.5 M H2SO4 and 0.6 M NaOH and measured with a Hanna pH instrument (Model H12002–02). After pH adjustment, the desired amount of Fe–CS was added to 50 ml of AWW, and the mixture was stirred at 180 rpm for the specified adsorption time. The Fe–CS/AWW mixture obtained at the end of each experimental run was separated by centrifugation and the supernatant was withdrawn for A–N analysis, while the spent Fe–CS adsorbent was recovered for characterization.

The experimental kinetic data used for mechanistic modeling was obtained by studying the effect of time (15–300 min) and concentration (15.0–75.0 mg/L) on the A–N adsorption capacity (qt). The result of this study is illustrated in supplementary material. The equilibrium A–N concentration was determined by Nessler’s Reagent spectrophotometry, while removal efficiency and adsorption capacity were determined using Eq. (1) and Eq. (2), respectively.

$${\text{A–N removal eff. }}\left( \% \right) = \frac{{C_{0} - C_{t} }}{{C_{0} }} \times 100$$
(1)
$$q_{t} = \frac{{C_{0} - C_{t} }}{m} V$$
(2)

\(C_{0}\) = Initial concentration of A–N (mg/L), \(C_{t}\) = residual concentration of A–N (mg/L), V = volume of AWW per batch (L), m = mass of Fe–CS (g) and \(q_{t}\) = adsorption capacity at time, t (mg/g).
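Eqs. (1) and (2) can be sketched in a few lines of Python. The function names are illustrative, and the residual concentration used in the example is a hypothetical value, not a measured result; the initial concentration, batch volume and dosage are taken from the levels stated above.

```python
def removal_efficiency(c0, ct):
    """Eq. (1): percentage of A-N removed; c0 and ct in mg/L."""
    if c0 <= 0:
        raise ValueError("initial concentration must be positive")
    return (c0 - ct) / c0 * 100.0


def adsorption_capacity(c0, ct, volume_l, mass_g):
    """Eq. (2): uptake q_t in mg/g for batch volume V (L) and adsorbent mass m (g)."""
    if mass_g <= 0:
        raise ValueError("adsorbent mass must be positive")
    return (c0 - ct) * volume_l / mass_g


# C0 = 44.5 mg/L, V = 50 ml and m = 1.6 g come from the text;
# Ct = 8.9 mg/L is a hypothetical residual concentration for illustration.
eff = removal_efficiency(44.5, 8.9)
q_t = adsorption_capacity(44.5, 8.9, 0.050, 1.6)
```

Note that the volume must be expressed in litres for \(q_{t}\) to come out in mg/g.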

Predictive and mechanistic modeling

RSM

The literature review showed the dependence of A-N removal efficiency on five process variables (pH, adsorbent dosage, initial concentration, adsorption temperature, and contact time) (Ren et al. 2021; Tu et al. 2019; Li et al. 2020; Couto et al. 2016; Kizito et al. 2015; Hodur et al. 2020; Wang et al. 2018; Arslan et al. 2011). The RSM–Central Composite Design provides the process variables’ synergistic study framework using an optimal number of experimental runs. Given that all the parameters are measurable, the mathematical relationship between independent variables and removal efficiency is expressed by the second-order polynomial function (Eq. 3). To estimate the statistical significance of each term in the polynomial function, the independent variables and corresponding responses were analyzed using analysis of variance, ANOVA (Table 5) (Betiku et al. 2018; Ohale et al. 2017).

$$y = b_{0} + \sum b_{i} X_{i} + \sum b_{ii} X_{i}^{2} + \sum b_{ij} X_{i} X_{j} + \varepsilon$$
(3)
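As a minimal sketch of the structure of Eq. (3), the snippet below expands a vector of factor settings into the intercept, linear, squared and two-factor interaction regressors. The function names and the ordering of the terms are assumptions made for illustration; they are not taken from Design-Expert.

```python
from itertools import combinations


def quadratic_terms(x):
    """Regressors of a second-order model: 1, x_i, x_i^2 and x_i*x_j (i < j)."""
    linear = list(x)
    squares = [xi * xi for xi in x]
    interactions = [xi * xj for xi, xj in combinations(x, 2)]
    return [1.0] + linear + squares + interactions


def predict(coefficients, x):
    """Evaluate the polynomial for one run (the error term is omitted)."""
    terms = quadratic_terms(x)
    if len(coefficients) != len(terms):
        raise ValueError("one coefficient is needed per model term")
    return sum(b * t for b, t in zip(coefficients, terms))


# Five factors give 1 + 5 + 5 + 10 = 21 terms in the full quadratic model.
n_terms = len(quadratic_terms([0.0, 1.0, -1.0, 0.5, 0.0]))
```

With five process variables the full model therefore has 21 coefficients to estimate, which is why insignificant terms are commonly screened out after ANOVA.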

For statistical analysis, the experimental variable has been coded as shown in Eq. 4:

$$x_{i} = \frac{{X_{i} - X_{n} }}{{\Delta X_{i} }}$$
(4)

where \(x_{i}\) is the coded value of the ith independent variable, \(X_{i}\) is the actual value of the ith independent variable, \(X_{n}\) is the actual value of the ith independent variable at the center point, and \(\Delta X_{i}\) is the step change value of the real variable (Betiku et al. 2018; Ohale et al. 2017). The design matrix (Table S1) was generated based on the five-level input parameters given in Table 1, and the response surface methodology was carried out using the Design-Expert software 11.0 trial version (Stat-Ease Inc., Minneapolis, USA).

Table 1 Levels of independent variables for central composite design
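The coding of Eq. (4) can be illustrated with the pH levels listed under "Adsorption studies". The sketch assumes that the center point is the middle level (pH 7) and that the step change is the distance from the center to the next level (2 pH units); both values are inferred from the listed levels rather than quoted from Table 1.

```python
def code_level(actual, centre, step):
    """Eq. (4): convert an actual factor value to its coded value."""
    return (actual - centre) / step


def decode_level(coded, centre, step):
    """Inverse of Eq. (4): recover the actual value from a coded value."""
    return centre + coded * step


# pH levels from the text; centre = 7 and step = 2 are inferred assumptions.
ph_levels = [4, 5, 7, 9, 10]
coded_ph = [code_level(v, centre=7.0, step=2.0) for v in ph_levels]
```

Under these assumptions the five levels map to coded values of -1.5, -1, 0, +1 and +1.5, which would correspond to axial points at a distance of 1.5 from the center.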

ANN

A Multi-Layer Perceptron (MLP), trained with the Levenberg–Marquardt backpropagation algorithm, was used in developing the ANN. The MLP consists of five input variables making up the input layer, with the A–N removal efficiency representing the output neuron (see Fig. 1a). The input to each neuron consists of its bias and the sum of its weighted inputs. The mathematical expression describing the neuron is given in Eq. 6, while the data set employed in ANN modeling was the same as that used in RSM (see Table S1). Seventy per cent (70%) of the data set was used for network training, while the remaining 30% was split evenly between the validation and testing sets (Ohale et al. 2017). To eliminate the influence of large-valued process variables, all input parameters and responses were normalized using Eq. 5. The ANN was executed in MATLAB R2015b (Mathworks, Inc.).

$$X_{norm.} = \frac{{X_{i} - X_{\min .} }}{{X_{\max .} - X_{\min .} }}$$
(5)
$$Y_{i} = \mathop \sum \limits_{i = 1}^{n} x_{i} \omega_{i} + \theta_{i}$$
(6)

where \(X_{norm.}\) represents the normalized value of \(X_{i}\), and \(X_{\min .}\) and \(X_{\max .}\) denote the minimum and the maximum values of the data set. \(Y_{i}\) is the net input to node \(i\) of the hidden layer, \(\omega_{i} \left( {i = 1, n} \right)\) denotes the connection weights, \(\theta_{i}\) represents the bias and \(x_{i}\) is the input parameter.
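The normalization of Eq. (5) and the 70/15/15 partitioning described above can be sketched as follows. The random shuffling with a fixed seed is an assumption made for reproducibility of the example; the division routine actually used in MATLAB may assign runs differently.

```python
import random


def min_max_normalize(values):
    """Eq. (5): rescale a variable to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant variable cannot be normalized")
    return [(v - lo) / (hi - lo) for v in values]


def split_indices(n_runs, train_frac=0.70, seed=0):
    """Shuffle run indices, then split into training, validation and test sets."""
    idx = list(range(n_runs))
    random.Random(seed).shuffle(idx)  # seed is an illustrative choice
    n_train = round(train_frac * n_runs)
    n_val = (n_runs - n_train) // 2   # remainder shared evenly
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


time_norm = min_max_normalize([30, 60, 120, 180, 210])  # contact times, min
train_idx, val_idx, test_idx = split_indices(32)         # 32 runs (Table S1)
```

For 32 runs this rounding gives 22 training, 5 validation and 5 test runs.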

Fig. 1 Architectural framework of (a) ANN and (b) ANFIS

The weighted output was subjected to a nonlinear activation function (Tu et al. 2019), while the logistic output function is given in Eq. (7).

$$sf \left( {sum} \right) = \frac{1}{{1 + \exp \left( { - sum} \right)}}$$
(7)
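Eqs. (6) and (7) together describe the output of a single neuron, which can be sketched as below. The weights and bias in the example are hypothetical values chosen so that the net input is zero; they are not trained parameters from this study.

```python
import math


def net_input(inputs, weights, bias):
    """Eq. (6): weighted sum of the inputs plus the bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def logistic(s):
    """Eq. (7): logistic function, written to avoid overflow for large |s|."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def neuron_output(inputs, weights, bias):
    """Pass the net input of one neuron through the logistic function."""
    return logistic(net_input(inputs, weights, bias))


# Five normalized inputs; weights and bias are hypothetical.
x = [0.5, 0.5, 0.5, 0.5, 0.5]
w = [0.2, -0.4, 0.1, 0.3, -0.2]
out = neuron_output(x, w, bias=0.0)
```

Because the output of Eq. (7) is bounded between 0 and 1, normalizing the response with Eq. (5) keeps the targets within the range the output function can reach.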

The number of hidden neurons was varied from 2 to 12, and a suitable number was chosen based on the results of regression and error function analysis obtained by applying Eqs. (8) and (9). The graphical result of this test is presented in Fig. 5e.

$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {y_{\exp .\left( i \right)} - y_{{{\text{pred}}.\left( i \right)}} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{N} \left( {y_{{{\text{pred}}.\left( i \right)}} - y_{{\exp .{\text{ave}}.}} } \right)^{2} }}$$
(8)
$${\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {y_{{{\text{pred}}.\left( i \right)}} - y_{\exp .\left( i \right)} } \right)^{2} }}{N}}$$
(9)

where \(N\) is the number of data points, \(y_{{{\text{pred}}.\left( i \right)}}\) is the ANN prediction, \(y_{\exp .\left( i \right)}\) is the actual experimental response, \(y_{\exp .ave.}\) is the mean value of experimental data and \(i\) is the data index (Onu et al. 2020).
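A minimal sketch of the two selection criteria follows. The \(R^{2}\) function reproduces Eq. (8) as printed, in which the denominator is built from the predicted values and the experimental mean; many texts instead use the experimental values in the denominator. The data in the example are hypothetical.

```python
import math


def r_squared(y_exp, y_pred):
    """Eq. (8) as printed: 1 - SS_residual / SS_(pred - experimental mean)."""
    mean_exp = sum(y_exp) / len(y_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pred))
    ss_ref = sum((p - mean_exp) ** 2 for p in y_pred)
    return 1.0 - ss_res / ss_ref


def rmse(y_exp, y_pred):
    """Eq. (9): root-mean-square error between prediction and experiment."""
    n = len(y_exp)
    return math.sqrt(sum((p - e) ** 2 for e, p in zip(y_exp, y_pred)) / n)


# Hypothetical removal efficiencies (%), for illustration only.
y_exp = [60.0, 70.0, 80.0, 90.0]
y_pred = [62.0, 68.0, 82.0, 88.0]
r2_value = r_squared(y_exp, y_pred)
rmse_value = rmse(y_exp, y_pred)
```

In a hidden-neuron sweep, these two values would be computed for each candidate network and the size with the lowest RMSE and highest \(R^{2}\) retained.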

ANFIS

A fuzzy model for the prediction of A–N removal efficiency was initiated using a feed-forward neural network structure. The network consists of a multi-input single-output (MISO) ANFIS model which was developed using the fuzzy logic toolbox of MATLAB R 2015 b (Mathworks Inc.). The architectural framework of the suggested ANFIS model, whose structure involved five discrete layers (Betiku et al. 2016) is presented in Fig. 1b. First-order Sugeno inference systems that convert input parameters into membership functions (MF) were employed for this investigation. To further analyze the ANFIS framework presented in Fig. 1b, the fuzzy inference system was assumed to have two input variables (\(x, y\)) and one output (\(f)\), as a result, the fuzzy ‘IF–THEN’ rules apply as expressed in Eqs. (10) and (11) (Betiku et al. 2016; Betiku et al. 2018).

$${\text{Rule 1: IF }} x {\text{ is }} A_{1} {\text{ and }} y {\text{ is }} B_{1} , {\text{ then }} f_{1} = k_{1} x + l_{1} y + m_{1}$$
(10)
$${\text{Rule 2: IF }} x {\text{ is }} A_{2} {\text{ and }} y {\text{ is }} B_{2} , {\text{ then }} f_{2} = k_{2} x + l_{2} y + m_{2}$$
(11)

where \(A_{i}\) and \(B_{i}\) are fuzzy sets, \(f_{i}\) is output, \(k_{i} , l_{i}\) and \(m_{i}\) are adjustable parameters set during ANFIS training [38]. Five layers of the ANFIS model were explained via Eqs. (12)–(16).

Layer 1 All nodes \(\left( i \right)\) in the first layer are defined by the function in Eq. (12).

$$O_{i}^{1} = \mu_{{A_{i} }} \left( x \right)$$
(12)

where \(x\) is the input to node \(i\) and \(O_{i}^{1}\) is the membership grade of fuzzy set \(A_{i}\).

Layer 2 Every node \(i\) approximates the weight of each membership function (MF) by way of multiplication (Betiku et al. 2018), as shown in Eq. (13).

$$O_{i}^{2} = w_{i} = \mu_{{A_{i} }} \left( x \right)*\mu_{{B_{i} }} \left( y \right);\left[ {i = 1, 2} \right]$$
(13)

Layer 3 In this third layer, the approximated MFs are normalized by computing each activation level using Eq. (14).

$$O_{i}^{3} = \overline{{w_{i} }} = \frac{{w_{i} }}{{w_{1} + w_{2} }},\left[ {i = 1, 2} \right]$$
(14)

Layer 4 The fourth layer is used to compute the output by de-fuzzing the MFs via Eq. (15).

$$O_{i}^{4} = \overline{{w_{i} }} * f_{i} = \overline{{w_{i} }} \left( {k_{i} x + l_{i} y + m_{i} } \right)$$
(15)

where \(\overline{{w_{i} }}\) is the output of the third layer and \(k_{i}\), \(l_{i}\), and \(m_{i}\) are parameter set.

Layer 5 The fifth layer is a single non-adaptive node used to compute the overall output using Eq. (16).

$$O_{i}^{5} = \mathop \sum \limits_{i} \overline{{w_{i} }} *f_{i} = \frac{{\mathop \sum \nolimits_{i} w_{i} * f_{i} }}{{\mathop \sum \nolimits_{i} w_{i} }}$$
(16)
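The five layers can be traced with a small two-input, two-rule first-order Sugeno system. This is only a sketch: the Gaussian membership function and every numerical parameter below are hypothetical choices for illustration, not the trained values of this study.

```python
import math


def gaussmf(x, centre, sigma):
    """Gaussian membership grade (layer 1)."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)


def anfis_forward(x, y, rules):
    """Forward pass through layers 1 to 5 of a first-order Sugeno system."""
    # Layers 1-2: membership grades multiplied into rule firing strengths.
    firing = [gaussmf(x, *r["A"]) * gaussmf(y, *r["B"]) for r in rules]
    # Layer 3: normalize the firing strengths.
    total = sum(firing)
    normalized = [w / total for w in firing]
    # Layer 4: weight each linear consequent f_i = k*x + l*y + m.
    weighted = [wb * (r["k"] * x + r["l"] * y + r["m"])
                for wb, r in zip(normalized, rules)]
    # Layer 5: sum the weighted consequents into one output.
    return sum(weighted)


# Hypothetical premise (centre, sigma) and consequent (k, l, m) parameters.
rules = [
    {"A": (0.0, 0.5), "B": (0.0, 0.5), "k": 1.0, "l": 1.0, "m": 0.0},
    {"A": (1.0, 0.5), "B": (1.0, 0.5), "k": 2.0, "l": -1.0, "m": 1.5},
]
output = anfis_forward(0.5, 0.5, rules)
```

At the midpoint between the two rule centres both rules fire equally, so the output is the plain average of the two consequents. During training, the premise parameters (centre, sigma) and the consequent parameters (k, l, m) are the quantities that get adjusted.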

Appraisal of predictive models

To ascertain the superiority of either ANN, RSM, or ANFIS in predicting the A-N removal efficiency, the respective model predictions were compared using error function indices (such as regression coefficient, R; coefficient of determination, R2; adjusted–R2; absolute average relative error, AARE; Marquardt’s percent standard deviation, MPSED; root-mean-square error, RMSE; standard deviation, SD; sum of squares error, SSE; and hybrid fractional error function, HYBRID) (Onu et al. 2020; Betiku et al. 2016; Betiku et al. 2018; Ohale et al. 2017; Agu et al. 2020). Mathematical expressions of the error indices (Eqs. 17–25) are tabulated in Table 2.

Table 2 The error function and adsorption mechanistic models

Process optimization

To determine the best parameters for optimum A-N removal, RSM and GA optimization techniques were adopted. To execute RSM optimization, the removal efficiency was set at maximum desirability, while the process factors were constrained within the experimental ranges. The genetic algorithm (GA), a statistical exploration technique capable of simulating natural biological evolution, was also used in solving the optimization problem. The developed models (RSM, ANN, ANFIS) were coupled with GA and used as a decision parameter in GA optimization (which occurs through a 4-staged cycle). The cycle was sustained until the attainment of a desirable outcome; thus, the best sequence produced at the convergence of the above-described loop becomes the solution to the optimization problem (Betiku et al. 2016, 2018). RSM optimization was implemented using Design-Expert 11.0 trial version (Stat-Ease Inc., Minneapolis, USA), while GA optimization was carried out using the optimization toolbox of MATLAB R2015b (Mathworks Inc.). The GA procedure is illustrated in the supplementary material, Fig. S3.
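The coupling of a predictive model with GA can be sketched as below. The four stages are assumed here to be evaluation, selection, crossover and mutation, which is a common GA layout and not necessarily the exact cycle of Fig. S3. The objective is a toy two-factor surrogate with a hypothetical optimum; it stands in for the fitted RSM, ANN or ANFIS model and is not the model of this study.

```python
import random


def genetic_maximize(fitness, bounds, pop_size=40, generations=60,
                     mutation_rate=0.2, seed=0):
    """Small real-coded GA that maximizes `fitness` inside box `bounds`."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Stage 1: evaluate and rank; the two best are carried over (elitism).
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:2]
        while len(next_gen) < pop_size:
            # Stage 2: tournament selection of two parents.
            p1 = max(rng.sample(ranked, 3), key=fitness)
            p2 = max(rng.sample(ranked, 3), key=fitness)
            # Stage 3: arithmetic crossover.
            a = rng.random()
            child = [a * g1 + (1 - a) * g2 for g1, g2 in zip(p1, p2)]
            # Stage 4: Gaussian mutation, then clip to the bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] += rng.gauss(0.0, 0.1 * (hi - lo))
                child[i] = min(max(child[i], lo), hi)
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


def toy_surrogate(x):
    """Hypothetical stand-in for a fitted model (peak of 90 at pH 6, 2.0 g)."""
    ph, dose = x
    return 90.0 - (ph - 6.0) ** 2 - 10.0 * (dose - 2.0) ** 2


# Bounds follow the experimental pH and dosage ranges given in the text.
best = genetic_maximize(toy_surrogate, bounds=[(4.0, 10.0), (0.7, 2.5)])
```

Clipping each child to the bounds keeps the search inside the experimental region, mirroring the constraint placed on the process factors in the RSM optimization.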

Mechanistic modeling

Molecular diffusion is predictably influenced by film diffusion, pore diffusion or mass actions. However, the effect of mass action occurs rapidly; hence, it is considered negligible in the adsorption kinetics. Therefore, the liquid film adsorption mechanism is principally controlled by either film or pore diffusion (Ohale et al. 2020; Aniagor et al. 2018). To investigate the rate-limiting step in the adsorptive process of A–N onto Fe–CS, mechanistic models listed in Table 3 are applied (Ohale et al. 2020; Onu et al. 2020; Aniagor et al. 2018).

Table 3 The expression of the mechanistic models applied in the study

Results and discussion

AWW characterization

The AWW is typically assessed using the parameters tabulated in Table 4. From the results shown, the amount of A-N present in the raw effluent was significantly higher than the stipulated discharge limit by NESREA and WHO (Ohale et al. 2020; Onu et al. 2020). Aside from the pollutant of interest (A–N), other wastewater characterization indicators such as biochemical oxygen demand (BOD5) and chemical oxygen demand (COD) were considerably higher than the specified discharge limit. This may be due to the presence of substantial amounts of organic matter in AWW (Okey-Onyesolu et al. 2020). The recorded pH values ranged between pH 7.0 and pH 7.4, the values which are well within the tolerable NESREA, WHO and FEPA discharge limits (Okey-Onyesolu et al. 2020; Onu et al. 2020). The high A–N content of our AWW sample justifies the need for predisposal treatment.

Table 4 The physicochemical characteristics of AWW

Characterization of CS, Fe–CS and Fe–CS loaded A–N

FTIR analyses

FTIR is a dynamic technique that provides valuable information regarding the surface chemistry of substances. Depicted in Fig. 2a are the FTIR spectra of CS, Fe–CS and Fe–CS-loaded A–N. The FTIR spectrum of CS indicated the presence of important peaks at wave-numbers 3324.8, 2903.6, 1628.8, 1582.6, 1428.8, 1241.2, 1032.5, and 834.9 cm−1. The band at 3324.8 cm−1 is attributed to the presence of an aromatic group, while the strong absorption peak at 1032.5 cm−1 demonstrates the presence of an aliphatic C–H stretching group. The existence of amide I (C=O secondary amide stretch) and amide II (C–N stretch, N–H bend) in the CS was represented by the bands at 1628.8 and 1582.6 cm−1, respectively, while the band at 2903.6 cm−1 denotes the presence of amide C–H stretching. The presence of a phosphorus compound with P–F stretching and a methyl group with C–N bending was illustrated by the absorption bands at 834.9 cm−1 and 1428.8 cm−1, respectively. A similar observation was reported by Ohale et al. (2020).

Fig. 2 (a) FTIR spectra and (b) XRD pattern

The post-functionalization FTIR spectrum of Fe–CS displayed several recognizable peaks. The bands at 3324.8, 2903.6 and 1628.8 cm−1, associated with the aromatic and amide groups of CS, were retained, although with diminished intensities. The slight shift from 1032.5 to 1049.1 cm−1 is assigned to the –OH deformation vibration due to the Fe–CS thermal activation. The appearance of a new peak at 710.8 cm−1, in addition to the observed alterations in some others (for instance, the wave-numbers at 1428.8 and 834.9 cm−1 shifted to 1440.3 and 873.2 cm−1, respectively), reveals a characteristic calcite spectrum (Ohale et al. 2020; Dai et al. 2017).

The post-adsorption FTIR spectrum of Fe–CS-loaded A–N showed obvious wave-number shifts. The C=O amide stretch previously located at 1628.8 cm−1 shifted to 1619.1 cm−1, with a corresponding intensity reduction. The sharp peak at 1049.1 cm−1 further shifted to 1088.3 cm−1, while the band at 873.2 cm−1 deviated slightly to 881.3 cm−1. These vibrational shifts and intensity reductions, especially those relating to the amide functional groups, suggest their significant contribution to the adsorptive uptake of A–N.

XRD analyses

The crystallographic features of CS, Fe–CS, and Fe–CS-loaded A–N determined via XRD technique are depicted in Fig. 2b. The XRD pattern of CS portrays a well-structured spectral pattern with prominent 2-theta reflections at 9.5°, 22.9°, 29.4°, 43.2°, and 47.6°. The XRD pattern of Fe–CS exhibited similar 2-theta reflections as CS; however, stronger peaks were observed for Fe–CS at 22.9° and 29.4°. The pronounced peaks at 22.9°, 29.4°, and 43.2° indicate that the principal crystal in CS and Fe–CS was calcite. Meanwhile, the higher 2-theta reflection intensity at 29.4° in Fe–CS illustrated a significant concentration of calcite crystal in Fe–CS compared to CS. The presence of ferric-based crystal in Fe–CS was illustrated by the 2-theta reflection at 44.6°. This observation is just as expected because iron nitrate was utilized during the CS functionalization process. Similar observations have been reported elsewhere (Ohale et al. 2020; Dai et al. 2017). The post-adsorption XRD pattern indicated minor alterations in the Fe–CS-loaded A–N structural configuration, thus, illustrating the crystalline stability of Fe–CS despite the A–N adsorption.

SEM analyses

SEM analyses fundamentally examine the particle size, porosity, and morphological properties of any given adsorbents. The surface morphology of CS (Fig. 3a) depicts the existence of a layer-stacking structure with irregular flakes, an indication that the CS was composed of fibrillary structure and crispy rough edges (Ohale et al. 2020). Figure 3b (SEM image of Fe–CS) shows the development of superior surface cohesion, with moderate thin layer tissues and a coral-like porous structure. These improvements in the surface properties of Fe–CS are attributed to the modification techniques carried out during the CS functionalization procedure. The morphological micrograph of Fe–CS-loaded A–N as depicted in Fig. 3c shows the appearance of an intense river-like morphology, thus, validating effective adsorption of A–N onto Fe–CS.

Fig. 3 SEM micrographs of (a) CS, (b) Fe–CS and (c) Fe–CS-loaded A–N, and (d) TGA analysis spectra

TGA analyses

TGA was conducted to evaluate the mass stability of CS with temperature variations. The TGA results serve as a useful guide for the selection of suitable thermal activation temperatures (Ohale et al. 2020; Sebestyén et al. 2020). The TGA results of CS, Fe–CS, and Fe–CS-loaded A–N are illustrated in Fig. 3d, and the respective curves portrayed three (3) thermal process stages. For CS, the first stage was recorded between 50 and 195 °C. During this period, a 6% loss of initial mass was recorded, which could be attributed to the volatilization of surface organic matter and water desorption. The second stage, which illustrates an accelerated mass reduction, was observed between 225 and 330 °C. This stage accounted for a 38% loss of CS mass. This massive weight loss is attributable to the probable dehydroxylation of the OH functional group and decomposition of the acetyl groups (Ren et al. 2021). Beyond 330 °C, CS attained the final stage of thermal equilibrium, which was sustained until the end of the analysis at 600 °C.

Fe–CS and Fe–CS loaded A–N exhibited very similar thermal behavior as demonstrated in Fig. 3d. The initial stage for both samples (Fe–CS and Fe–CS-loaded A–N) occurred between 50 and 350 °C. This characteristic of high thermal stability was a direct consequence of the Fe–CS calcination step. Between 350 and 370 °C, a rapid mass loss which accounted for about 19.5 and 25% weight reduction in Fe–CS and Fe–CS loaded A–N, respectively, was recorded. However, the samples (Fe–CS and Fe–CS-loaded A–N) attained thermal equilibrium beyond 370 °C. The TGA result further showed that Fe–CS is more thermally stable than Fe–CS-loaded A–N. Such observation is not surprising, noting that Fe–CS-loaded A–N contained a substantial amount of imbibed adsorbate, which contributed significantly to its weight.

Experimental design

RSM

The combined effects of pH, Fe–CS dosage, initial concentration, temperature, and contact time on the A–N removal efficiency were studied using a central composite design. Results obtained from the respective experimental runs are presented in the supplementary material (Table S1). Table 5 shows the relevant parameters generated from the analysis of variance (ANOVA). The ANOVA technique employs the p value and f-value to determine the adequacy and goodness of fit of the empirical models. A confidence level of 95% was used to assess the p values; thus, the lower the p value (p < 0.05), the higher the significance of the corresponding model term and vice versa (Onu et al. 2021b). The full quadratic model and the reduced quadratic model obtained after the elimination of the insignificant terms are presented in Table 5. Meanwhile, the developed RSM model prediction is given in Eq. (32).

Table 5 Test of significance for model coefficients and analysis of variance

Besides the p values, the f-values are also useful in ascertaining the significance of each term in the quadratic model. This was accomplished by evaluating the ratio between the mean square and the residual error of the quadratic model. Hence, by comparing the models’ lack of fit parameters for the reduced quadratic model, an f-value of 107.96 (Table 5) was recorded, an implication that the quadratic model is significant, relative to the pure error. The lack of fit f-value of 1.09 also implies the lack of fit is not significant relative to the pure error. Lack of fit p-value suggests that there is a 49.35% chance that the f-value for lack of fit is attributable to noise. Furthermore, the predicted R–squared of 0.9507 is in reasonable agreement with the adjusted R–squared (R2 = 0.9822), thus, suggesting the reproducibility of the RSM model (Ohale et al. 2017; Onu et al. 2020, 2021a; Betiku et al. 2018). The adequacy of the quadratic model is evaluated using the normal plot of residuals shown in Fig. 4a. It was observed that the residuals sustain a close alignment with the normality line, thereby, confirming the normality of the residual points.

Fig. 4 RSM plots for (a) normal residuals, (b) residuals vs. predicted, and (c) Pareto effect

Similarly, the plot of residuals vs. predicted values shown in Fig. 4b illustrates the random positioning of residuals around the baseline. This observation is a further indication of the suitability and accuracy of the developed quadratic model. An indication of the signal-to-noise ratio is given by the adequate precision value (APR). According to Betiku et al. (2018), for a model to effectively navigate the design space, an APR greater than 4.0 is required. Therefore, the APR of 40.789 recorded in this study (Table 5) indicates the occurrence of sufficient signal relative to the noise. Also, the obtained coefficient of variation (CV) value (3.07%) indicates that the quadratic model was satisfactorily reproducible, judging from the assertions made by Onu et al. (2021a). The effect of the respective model terms on the overall removal efficiency prediction was demonstrated using the Pareto effect plot (Fig. 4c), while the influence of the corresponding factors was estimated using Eq. (33). Figure 4c shows that the Fe–CS dosage (\(x_{2}\)) had the greatest influence on the A–N removal efficiency, thus buttressing the adsorptive applicability of Fe–CS.

$$\begin{gathered} A - N~{\text{rem}}.~\left( \% \right) = 45.7989x_{1} ~ - ~449.7647x_{2} + ~9.5928x_{3} ~ + ~148.8144x_{4} \hfill \\ \qquad\quad\quad\quad\quad\;\;\;\;\;\; - ~0.1576x_{5} ~~~ + ~0.0794x_{1} x_{3} - ~0.0187x_{1} x_{5} - ~0.3096x_{2} x_{3} \hfill \\ \qquad\quad\quad\quad\quad\;\;\;\;\;\; + ~1.6374x_{2} x_{4} + ~0.1952x_{2} x_{5} - ~0.0296x_{3} x_{4} - ~3.5087~x_{1}^{2} \hfill \\ \qquad\quad\quad\quad\quad\;\;\;\;\;\; - ~16.4300x_{2}^{2} - ~0.0101x_{3}^{2} - ~0.2362x_{4}^{2} - ~23591.2578 \hfill \\ \end{gathered}$$
(32)
$$P_{i} = \left[ {\frac{{b_{i} }}{{\sum b_{i} }}} \right] \times 100$$
(33)

where b is the f-value for the respective model term.

ANN

The graphical expression for the topological analysis of ANN is presented in Fig. 5(a–d) the data partitioning (as a training set and test set) was conducted to eliminate over-training and over parameterization (Ohale et al. 2017; Onu et al. 2021b). Based on the hidden neurons selection criteria described in ″ANN″ Section, seven hidden neurons emerged as the most appropriate, because they depicted the least root-mean-square error (RMSE = 0.3619) and highest correlation coefficient (R2 = 0.9981) values (see Fig. 5e). Hence, the developed network was described as a 5–7–1 (five input neurons, seven hidden neurons, and one output neuron) ANN architecture. Furthermore, the correlation coefficients obtained from the regression plots were 0.9919, 0.9625, 0.9447, and 0.9686 for training, validation, testing, and overall data sets, respectively, evidence for a high correlation between experimental data and ANN predictions. The consistency of the training process was estimated using the validation performance plot shown in Fig. 5f. The best validation performance of the training network generated a mean square error of 1.5061E-04 at the 53rd epoch iteration. The negligible mean square error value recorded for the study suggests that the absence of any over-fitting difficulty within the network (Onu et al. 2021a; Nwadike et al. 2020). The estimated R2 and adjusted R2 of the ANN model were 0.9025 and 0.8945, respectively. This suggests that 90.25% of the variations in experimental and predicted values can be described by the ANN model. Significant R2 value established for the ANN model illustrates its capability in capturing the nonlinear nature of the adsorptive process of A–N onto Fe–CS.

Fig. 5
figure 5

ANN plots for a training data b validation data c test data d overall data e effect of hidden neurons and f performance evaluation

ANFIS

The Sugeno type ANFIS structure for five input parameters and one output variable generated by grid partitioning is displayed in Fig. S1 (Supplementary data). The ANFIS structure was designed using a hybrid learning procedure that incorporates the least square and gradient technique. To enhance the effectiveness of the system, the raw data were normalized using Eq. (5). Among the five tested membership functions (trimf, trapmf, gbellmf, gaussmf, and guass2mf), gaussmf was selected as the most suitable for the development of the fuzzy inference system (FIS). The ANFIS architecture and training parameters are listed in Table 6.

Table 6 ANFIS architecture and training parameters

Plots of the experimental and predicted A–N removal rates against run numbers for training, testing, checking, and overall data set are illustrated in Fig. 6(a–d), respectively. The significant spread of the interwoven data depicted in these plots is indicative of a high correlation between experimental and ANFIS predicted data. Furthermore, the calculated values R2 and adjusted R2 of the overall model performance were 0.9998 and 0.9978, respectively. This high R2 further gave credence to the ability of the ANFIS model in predicting the A–N adsorptive removal (Onu et al. 2021a). The adjusted R2 value implies that the ANFIS model can describe 99.78% of the variability between the experimental and predicted values (Betiku et al. 2018).

Fig. 6
figure 6

ANFIS plots for a training data, b testing data, c checking data, and d overall data

Model appraisal analysis

The precision of established models (RSM, ANN, and ANFIS) in estimating the A–N removal was appraised by comparing their error variance using the models presented in Table 2 and the results are presented in Table 7. According to Betiku et al. (2018), the value of R should be greater than 0.8 for an effective correlation between experimental and predicted values. Hence, the high R values (R > 0.95) obtained for the three models indicate their significant applicability in predicting experimental values. Adjusted R2 is applied for testing the extent of R2 overestimation, and its values obtained for the three models were satisfactorily sufficient, thus, validating their importance in predicting the A–N adsorptive removal. AARE was used to estimate the mean relative error between the model predictions and experimental values. However, the ANFIS model yielded negligible AARE values, a demonstration of its (ANFIS model) prediction accuracy and superiority over RSM and ANN. MPSED estimates the geometric error distribution of a system and allows for several degrees of freedom. The values of MPSED obtained for RSM, ANN, and ANFIS were 23.3%, 27.1%, and 0.03%, respectively, which demonstrates the high prediction accuracy of the ANFIS model in capturing the nonlinear nature of the adsorptive process. Low error magnitudes obtained by testing other statistical indicators (RMSE, SSE, SD, HYBRID) on the outputs (Table 7) further gave credence to the superiority of the ANFIS model in the data prediction accuracy of the present study.

Table 7 Statistical appraisal of RSM, ANN and ANFIS

In general, results obtained from statistical analysis indicate that ANFIS was the most effective model, while ANN was the least effective model in predicting the adsorptive removal of A–N from AWW onto Fe–CS. Thus, the prediction accuracy of the studied models followed the order: ANFIS > RSM > ANN. The results obtained here correlate favorably with the findings of Onu et al. (2020) and Dastorani et al. (2010).

Effect of process variables

Figure 7a and b presents the contour plots for the temperature-dosage, and temperature-concentration effects, respectively. Both figures illustrate the positive impact of temperature on the removal efficiency and the entire adsorptive process. At all levels of adsorbent dosage and effluent concentration, an increase in the system temperature resulted in a rapid removal rate, as depicted in the 2D contour plots of Fig. 7(a–d). Onu et al. (2021a, b), had earlier reported the augmentation of the adsorbate–adsorbent interaction rate and strengthening of the adsorbate ions’ mobility due to temperature increase. This phenomenon explains the observed increase in the removal efficiency upon temperature increase; thus, an endothermic process. Figure 7(b–e) depicts the 3D surface plots for the pH-dosage, and pH-concentration effects, respectively. The figures demonstrated that pH has a prominent quadratic effect on the A–N removal. At any given adsorbent dosage, increasing the pH from 4.0 to 6.8 resulted in a rapid A–N removal rate. However, increasing pH beyond pH 6.8 decreased the A–N removal efficiency (see Fig. 7b and e). Under slightly acidic conditions (pH 4.0–6.8), high amounts of H + competes with the ionized ammonia (NH4+) for available adsorption sites on Fe–CS, thus resulting in a reduced.

Fig. 7
figure 7

Surface and contour plots for the adsorption of A–N adsorption onto Fe–CS

removal rate. Similarly, from Eq. (34), it is evident that a large amount of NH4+ was converted to NH3.H2O molecule in an alkaline medium. The production of non-ionized NH3.H2O resulted in adsorption difficulties and reduced removal rate of A–N in alkaline medium.

$${\text{NH}}_{4}^{ + } + {\text{ OH}}^{-} \to {\text{NH}}_{3} .{\text{H}}_{2} {\text{O}}$$
(34)

The effect of initial concentration on the removal efficiency was studied via a variation in the A–N initial concentration from 1.75 to 87.25 mg/L (see supplementary material, Table S1). Figure 7(c–e) shows a continuous reduction in A–N removal efficiency due to a progressive increase in initial concentration. This trend is attributed to the increased occupation of readily available adsorption sites on the surface of Fe–CS. Conversely, the effect of Fe–CS dosage depicted a reverse trend (Fig. 7f), as more active sites were made available upon the Fe–CS dosage increase from 1.0–2.2 g. This increment subsequently enhanced the A-N removal efficiency, as demonstrated in Fig. 7f. Similar observations have been reported by other researchers (Ohale et al. 2020; Aniagor et al. 2018; Onu et al. 2021a, b; Okafor et al. 2015).

Process optimization

Four optimization techniques (RSM-GA, ANFIS-GA, ANN-GA, and RSM) were applied for optimizing the selected input variables (pH, dosage, concentration, temperature, and time) used for modeling the adsorptive process. The optimization is aimed at maximizing the A–N removal efficiency. The range of constraints used for genetic algorithm optimization is given in Eq. (35)–Eq. (39), while optimum values for A–N removal efficiency predicted by each method are given in Table 8. The graphical solutions for the RSM-GA, ANFIS-GA and ANN-GA optimization processes are given in Fig. 8(a–c), respectively. Judging by the figures, the removal efficiency increased steadily in a stepwise order from generation G1–G198 for RSM-GA; generation G1–G110 for ANFIS-GA; and generation G1–G182 for ANN-GA, and subsequently remained constant until termination of the process. Such observations suggest the absence of a probable crossover or mutation, with a substantive optimization effect within the parameters (Betiku et al. 2018, 2016).

Table 8 Optimization and model validation parameters
Fig. 8
figure 8

Genetic algorithm optimization for a RSM-GA b ANFIS-GA and c ANN-GA

Duplicate validation experiments were conducted at the predicted optimum conditions, and the average A–N removal efficiency was calculated and is recorded as actual removal efficiency in Table 8. ANFIS-GA gave the highest A–N removal efficiency prediction of 92.60% (at pH 6.5, 2.2 g, 18.8 mg/L, 317 K, and 156 min). The superiority of ANFIS-GA prediction performance over those of RSM-GA and ANN-GA is linked to their experimental data capturing accuracy. Therefore, regarding the quality and accuracy of the optimized process variables, the observed performance of the optimization techniques followed the trend: ANFIS-GA > RSM-GA > ANN-GA > RSM.

$$5.0 \le {\text{pH}} \le 9.0$$
(35)
$$1.0 \left( g \right) \le {\text{Fe}} - {\text{CS}} {\text{dosage}} \le 2.2 \left( g \right)$$
(36)
$$16 \left( {\frac{{{\text{mg}}}}{L}} \right) \le A - N {\text{concentration}} \le 73\left( {\frac{{{\text{mg}}}}{L}} \right)$$
(37)
$$308\left( K \right) \le {\text{Temperature}} \le 318\left( K \right)$$
(38)
$$60\left( {\min .} \right) \le {\text{time}} \le 180\left( {\min .} \right)$$
(39)

Mechanistic modeling

The adsorption kinetic data interpretation from a mechanistic viewpoint is an important step in describing the sorption process, and the accurate identification of the predominant sorption mechanism is also paramount for design purposes (Ohale et al. 2020). Generally, for an adsorption system, the solute transfer mechanism is typically characterized by either boundary layer diffusion (film) or intraparticle diffusion (pore), or both. Meanwhile, the final adsorption stages are mostly regarded as the equilibrium step, provided the adsorptive.

simulating a natural biological evolution was used in solving the optimization problems. The developed models (RSM, ANN, ANFIS) were coupled with GA and used as the decision parameter in GA optimization (which proceeds through a four-stage cycle). The cycle was sustained until a desirable outcome was attained; thus, the best sequence produced at the convergence of the above-described loop becomes the solution to the optimization problem (Betiku et al. 2016, 2018). RSM optimization was implemented using Design-Expert 11.0 trial version (Stat-Ease Inc., Minneapolis, USA), while GA optimization was carried out using the optimization toolbox of MATLAB R2015b (MathWorks Inc.). The GA procedure is illustrated in the supplementary material, Fig. S3.

Mechanistic modeling

Molecular diffusion is influenced by film diffusion, pore diffusion or mass action. However, mass action occurs rapidly; hence, its effect on the adsorption kinetics is considered negligible. Therefore, the adsorption mechanism is principally controlled by either film or pore diffusion (Ohale et al. 2020; Aniagor et al. 2018). To investigate the rate-limiting step in the adsorption of A–N onto Fe–CS, the mechanistic models listed in Table 3 were applied (Ohale et al. 2020; Onu et al. 2020; Aniagor et al. 2018).

Results and discussion

AWW characterization

The AWW was characterized using the parameters listed in Table 4. The results show that the amount of A–N in the raw effluent was significantly higher than the discharge limits stipulated by NESREA and WHO (Ohale et al. 2020; Onu et al. 2020). Aside from the pollutant of interest (A–N), other wastewater quality indicators such as biochemical oxygen demand (BOD5) and chemical oxygen demand (COD) were considerably higher than the specified discharge limits. This may be due to the presence of substantial amounts of organic matter in AWW (Okey-Onyesolu et al. 2020). The recorded pH values ranged between 7.0 and 7.4, which is well within the tolerable NESREA, WHO and FEPA discharge limits (Okey-Onyesolu et al. 2020; Onu et al. 2020). The high A–N content of the AWW sample justifies the need for its pre-disposal treatment.

Characterization of CS, Fe–CS and Fe–CS-loaded A–N

FTIR analyses

FTIR is a dynamic technique that provides valuable information regarding the surface chemistry of substances. Figure 2a shows the FTIR spectra of CS, Fe–CS and Fe–CS-loaded A–N. The FTIR spectrum of CS indicated the presence of important peaks at wavenumbers 3324.8, 2903.6, 1628.8, 1582.6, 1428.8, 1241.2, 1032.5, and 834.9 cm−1. The band at 3324.8 cm−1 is attributed to the presence of an aromatic group, while the strong absorption peak at 1032.5 cm−1 demonstrates the presence of an aliphatic C–H stretching group. The existence of amide I (C=O secondary amide stretch) and amide II (C–N stretch, N–H bend) in the CS is represented by the bands at 1628.8 and 1582.6 cm−1, respectively, while the band at 2903.6 cm−1 denotes the presence of amide C–H stretching. The presence of a phosphorus compound with P–F stretching and a methyl group with C–N bending is illustrated by the absorption bands at 834.9 and 1428.8 cm−1, respectively. A similar observation was reported by Ohale et al. (2020).

The post-functionalization FTIR spectrum of Fe–CS displayed several recognizable peaks. The bands at 3324.8, 2903.6 and 1628.8 cm−1, associated with the aromatic and amide groups of CS, were retained, although with diminished intensities. The slight shift from 1032.5 to 1049.1 cm−1 is assigned to the –OH deformation vibration due to the thermal activation of Fe–CS. The appearance of a new peak at 710.8 cm−1, in addition to the observed shifts in some others (for instance, the bands at 1428.8 and 834.9 cm−1 shifted to 1440.3 and 873.2 cm−1, respectively), reveals characteristic calcite spectra (Ohale et al. 2020; Dai et al. 2017).

The post-adsorption FTIR spectrum of Fe–CS-loaded A–N showed obvious wavenumber shifts. The C=O amide stretch previously located at 1628.8 cm−1 shifted to 1619.1 cm−1, with a corresponding intensity reduction. The sharp peak at 1049.1 cm−1 further shifted to 1088.3 cm−1, while the band at 873.2 cm−1 shifted slightly to 881.3 cm−1. These shifts and intensity reductions, especially those involving the amide functional groups, suggest that these groups contributed significantly to the adsorptive uptake of A–N from AWW.

XRD analyses

The crystallographic features of CS, Fe–CS, and Fe–CS-loaded A–N determined via the XRD technique are depicted in Fig. 2b. The XRD pattern of CS portrays a well-structured pattern with prominent 2-theta reflections at 9.5°, 22.9°, 29.4°, 43.2°, and 47.6°. The XRD pattern of Fe–CS exhibited 2-theta reflections similar to those of CS; however, stronger peaks were observed for Fe–CS at 22.9° and 29.4°. The pronounced peaks at 22.9°, 29.4°, and 43.2° indicate that the principal crystal in CS and Fe–CS was calcite. Meanwhile, the higher 2-theta reflection intensity at 29.4° in Fe–CS indicates a higher concentration of calcite crystals in Fe–CS than in CS. The presence of a ferric-based crystal in Fe–CS was illustrated by the 2-theta reflection at 44.6°. This observation is expected because iron nitrate was utilized during the CS functionalization process. Similar observations have been reported elsewhere (Ohale et al. 2020; Dai et al. 2017). The post-adsorption XRD pattern indicated only minor alterations in the structural configuration of Fe–CS-loaded A–N, thus illustrating the crystalline stability of Fe–CS despite the A–N adsorption.

SEM analyses

SEM analyses fundamentally examine the particle size, porosity, and morphological properties of a given adsorbent. The surface morphology of CS (Fig. 3a) depicts a layer-stacking structure with irregular flakes, an indication that the CS was composed of fibrillar structures with crisp, rough edges (Ohale et al. 2020). Figure 3b (SEM image of Fe–CS) shows the development of superior surface cohesion, with moderately thin tissue-like layers and a coral-like porous structure. These improvements in the surface properties of Fe–CS are attributed to the modification techniques carried out during the CS functionalization procedure. The micrograph of Fe–CS-loaded A–N (Fig. 3c) shows the appearance of an intense river-like morphology, thus validating the effective adsorption of A–N onto Fe–CS.

TGA analyses

TGA was conducted to evaluate the mass stability of CS with temperature variations. The TGA results serve as a useful guide for the selection of suitable thermal activation temperatures (Ohale et al. 2020; Sebestyén et al. 2020). The TGA results of CS, Fe–CS, and Fe–CS-loaded A–N are illustrated in Fig. 3d, and the respective curves portrayed three (3) thermal process stages. For CS, the first stage was recorded between 50 and 195 °C. During this stage, a 6% loss of initial mass was recorded, which could be attributed to the volatilization of surface organic matter and water desorption. The second stage, which shows an accelerated mass reduction, was observed between 225 and 330 °C. This stage accounted for a 38% loss of CS mass. This massive weight loss is attributed to the probable dehydroxylation of the OH functional group and decomposition of the acetyl groups (Ren et al. 2021). Beyond 330 °C, CS attained the final stage of thermal equilibrium, which was sustained until the termination of the analysis at 600 °C.

Fe–CS and Fe–CS-loaded A–N exhibited very similar thermal behavior, as demonstrated in Fig. 3d. The initial stage for both samples occurred between 50 and 350 °C. This characteristically high thermal stability was a direct consequence of the Fe–CS calcination step. Between 350 and 370 °C, a rapid mass loss was recorded, which accounted for about 19.5% and 25% weight reduction in Fe–CS and Fe–CS-loaded A–N, respectively. Both samples attained thermal equilibrium beyond 370 °C. The TGA result further showed that Fe–CS is more thermally stable than Fe–CS-loaded A–N. Such an observation is not surprising, noting that Fe–CS-loaded A–N contained a substantial amount of imbibed adsorbate, which contributed significantly to its weight.

Experimental design

RSM

The combined effects of pH, Fe–CS dosage, initial concentration, temperature, and contact time on the A–N removal efficiency were studied using a central composite design. Results obtained from the respective experimental runs are presented in the supplementary material (Table S1). Table 5 shows the relevant parameters generated from the analysis of variance (ANOVA). The ANOVA technique employs the p value and f-value to determine the adequacy and goodness of fit of the empirical models. The p values were assessed at a 95% confidence level; thus, the lower the p value (p < 0.05), the higher the significance of the corresponding model term, and vice versa (Onu et al. 2021b). The full quadratic model and the reduced quadratic model obtained after the elimination of the insignificant terms are presented in Table 5. Meanwhile, the developed RSM model equation is given in Eq. (32).

Besides the p values, the f-values are also useful in ascertaining the significance of each term in the quadratic model. Each f-value was obtained as the ratio of the mean square of the term to the residual mean square of the quadratic model. The reduced quadratic model recorded an f-value of 107.96 (Table 5), which implies that the model is significant. The lack-of-fit f-value of 1.09 implies that the lack of fit is not significant relative to the pure error; the corresponding p value indicates a 49.35% chance that a lack-of-fit f-value this large could be due to noise. Furthermore, the predicted R-squared of 0.9507 is in reasonable agreement with the adjusted R-squared of 0.9822 (a difference of less than 0.2), thus suggesting the reproducibility of the RSM model (Ohale et al. 2017; Onu et al. 2020, 2021a; Betiku et al. 2018). The adequacy of the quadratic model was evaluated using the normal plot of residuals shown in Fig. 4a. The residuals align closely with the normality line, thereby confirming that they are normally distributed.

Fig. 4 RSM plots for (a) normal residuals, (b) residuals vs. predicted values, and (c) Pareto effect

Similarly, the plot of residuals vs. predicted values shown in Fig. 4b illustrates the random scatter of residuals around the baseline. This observation is a further indication of the suitability and accuracy of the developed quadratic model. The signal-to-noise ratio is indicated by the adequate precision value (APR). According to Betiku et al. (2018), for a model to effectively navigate the design space, an APR greater than 4.0 is required. Therefore, the APR of 40.789 recorded in this study (Table 5) indicates an adequate signal relative to the noise. Also, the obtained coefficient of variation (CV) of 3.07% indicates that the quadratic model was satisfactorily reproducible, judging from the assertions made by Onu et al. (2021a). The effect of each model term on the overall removal efficiency prediction was demonstrated using the Pareto effect plot (Fig. 4c), while the influence of the corresponding factors was estimated using Eq. (33). Figure 4c shows that the Fe–CS dosage (\(x_{2}\)) exerted the greatest influence on the A–N removal efficiency, thus buttressing the adsorptive applicability of Fe–CS.

$$\begin{aligned} \text{A–N rem.}~(\%) = {} & 45.7989x_{1} - 449.7647x_{2} + 9.5928x_{3} + 148.8144x_{4} - 0.1576x_{5} \\ & + 0.0794x_{1}x_{3} - 0.0187x_{1}x_{5} - 0.3096x_{2}x_{3} + 1.6374x_{2}x_{4} + 0.1952x_{2}x_{5} \\ & - 0.0296x_{3}x_{4} - 3.5087x_{1}^{2} - 16.4300x_{2}^{2} - 0.0101x_{3}^{2} - 0.2362x_{4}^{2} - 23591.2578 \end{aligned}$$
(32)
$$P_{i} = \left[ \frac{b_{i}}{\sum b_{i}} \right] \times 100$$
(33)

where \(b_{i}\) is the f-value of the respective model term and \(P_{i}\) is its percentage effect.
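As a minimal sketch, Eqs. (32) and (33) can be evaluated numerically as shown below. The mapping of \(x_{1}\) to \(x_{5}\) onto pH, dosage (g), concentration (mg/L), temperature (K) and time (min), in actual units, is an assumption based on the order in which the factors are listed; the f-values passed to `pareto_percent` are hypothetical and are not taken from Table 5.

```python
def rsm_removal(x1, x2, x3, x4, x5):
    """Evaluate the reduced quadratic model of Eq. (32); returns A-N removal in %.

    Assumed mapping (not stated with the equation): x1 = pH, x2 = dosage (g),
    x3 = concentration (mg/L), x4 = temperature (K), x5 = time (min).
    """
    linear = (45.7989 * x1 - 449.7647 * x2 + 9.5928 * x3
              + 148.8144 * x4 - 0.1576 * x5)
    interaction = (0.0794 * x1 * x3 - 0.0187 * x1 * x5 - 0.3096 * x2 * x3
                   + 1.6374 * x2 * x4 + 0.1952 * x2 * x5 - 0.0296 * x3 * x4)
    quadratic = (-3.5087 * x1 ** 2 - 16.4300 * x2 ** 2
                 - 0.0101 * x3 ** 2 - 0.2362 * x4 ** 2)
    return linear + interaction + quadratic - 23591.2578


def pareto_percent(f_values):
    """Eq. (33): percentage effect of each model term from its f-value."""
    total = sum(f_values.values())
    return {term: 100.0 * f / total for term, f in f_values.items()}


# Hypothetical f-values, for illustration only (not the values of Table 5).
example_f = {"x1": 12.0, "x2": 60.0, "x3": 20.0, "x4": 8.0}
shares = pareto_percent(example_f)

# Model prediction at one point inside the studied ranges.
prediction = rsm_removal(6.5, 2.2, 18.8, 317.0, 156.0)
print(shares)
print(round(prediction, 2))
```

Because the fitted polynomial is not bounded to the 0 to 100% interval, predictions are only meaningful inside the experimental design space.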

ANN

The graphical results of the ANN topological analysis are presented in Fig. 5a–d. Data partitioning (into training and test sets) was conducted to avoid over-training and over-parameterization (Ohale et al. 2017; Onu et al. 2021b). Based on the hidden-neuron selection criteria described in the "ANN" section, seven hidden neurons emerged as the most appropriate, because they gave the lowest root-mean-square error (RMSE = 0.3619) and the highest correlation coefficient (R2 = 0.9981) (see Fig. 5e). Hence, the developed network was described as a 5–7–1 (five input neurons, seven hidden neurons, and one output neuron) ANN architecture. Furthermore, the correlation coefficients obtained from the regression plots were 0.9919, 0.9625, 0.9447, and 0.9686 for the training, validation, testing, and overall data sets, respectively, which is evidence of a high correlation between the experimental data and the ANN predictions. The consistency of the training process was assessed using the validation performance plot shown in Fig. 5f. The best validation performance of the trained network gave a mean square error of 1.5061E-04 at the 53rd epoch. The negligible mean square error recorded suggests the absence of any over-fitting within the network (Onu et al. 2021a; Nwadike et al. 2020). The estimated R2 and adjusted R2 of the ANN model were 0.9025 and 0.8945, respectively. This suggests that the ANN model can describe 90.25% of the variability between the experimental and predicted values. The high R2 value established for the ANN model illustrates its capability to capture the nonlinear nature of the adsorption of A–N onto Fe–CS.

Fig. 5 ANN plots for (a) training data, (b) validation data, (c) test data, (d) overall data, (e) effect of hidden neurons, and (f) performance evaluation
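To make the 5–7–1 topology concrete, the sketch below builds a single-hidden-layer network of that shape and runs one forward pass. The tanh hidden activation, the linear output and the random weights are assumptions made for illustration; they are not the trained network of this study.

```python
import math
import random


def init_network(n_in=5, n_hidden=7, n_out=1, seed=0):
    """Random weights and biases for a single-hidden-layer network."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [rng.uniform(-1, 1) for _ in range(n_out)]
    return w1, b1, w2, b2


def forward(x, params):
    """Forward pass: tanh hidden layer (assumed), linear output layer (assumed)."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def count_parameters(params):
    """Total number of adjustable weights and biases."""
    w1, b1, w2, b2 = params
    return (sum(len(r) for r in w1) + len(b1)
            + sum(len(r) for r in w2) + len(b2))


params = init_network()
# Five normalized inputs (hypothetical values): pH, dosage, concentration,
# temperature, time.
y = forward([0.5, 0.5, 0.5, 0.5, 0.5], params)
n_params = count_parameters(params)  # 5*7 + 7 + 7*1 + 1
print(n_params, y)
```

A 5–7–1 network of this form carries 50 adjustable parameters, which is one reason why partitioning the data and monitoring the validation error matter for a data set of modest size.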

ANFIS

The Sugeno-type ANFIS structure for five input parameters and one output variable, generated by grid partitioning, is displayed in Fig. S1 (Supplementary data). The ANFIS structure was designed using a hybrid learning procedure that combines the least-squares and gradient techniques. To enhance the effectiveness of the system, the raw data were normalized using Eq. (5). Among the five tested membership functions (trimf, trapmf, gbellmf, gaussmf, and gauss2mf), gaussmf was selected as the most suitable for the development of the fuzzy inference system (FIS). The ANFIS architecture and training parameters are listed in Table 6.

Table 6 ANFIS architecture and training parameters
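The selected membership function and the size of a grid-partitioned rule base can be illustrated with the short sketch below. The Gaussian form follows the usual definition of gaussmf with parameters sigma and c; the number of membership functions per input used in the example is hypothetical, since the contents of Table 6 are not reproduced here.

```python
import math


def gaussmf(x, sigma, c):
    """Gaussian membership function: exp(-(x - c)^2 / (2 * sigma^2))."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def grid_rule_count(mfs_per_input, n_inputs=5):
    """Grid partitioning creates one rule per combination of input MFs."""
    return mfs_per_input ** n_inputs


# Membership grade of a normalized input of 0.7 in a fuzzy set centered at 0.5.
grade = gaussmf(0.7, sigma=0.2, c=0.5)

# Hypothetical choice of two membership functions for each of the five inputs.
rules = grid_rule_count(2)
print(round(grade, 4), rules)
```

The rule count grows exponentially with the number of inputs, so grid partitioning with five inputs is usually kept to a small number of membership functions per input.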

Plots of the experimental and predicted A–N removal rates against run numbers for the training, testing, checking, and overall data sets are illustrated in Fig. 6(a–d), respectively. The close interweaving of the experimental and predicted data points in these plots is indicative of a high correlation between the experimental and ANFIS-predicted data. Furthermore, the calculated R2 and adjusted R2 of the overall model performance were 0.9998 and 0.9978, respectively. This high R2 lends further credence to the ability of the ANFIS model to predict the A–N adsorptive removal (Onu et al. 2021a). The adjusted R2 value implies that the ANFIS model can describe 99.78% of the variability between the experimental and predicted values (Betiku et al. 2018).

Fig. 6 ANFIS plots for (a) training data, (b) testing data, (c) checking data, and (d) overall data

Model appraisal analysis

The precision of the established models (RSM, ANN, and ANFIS) in estimating the A–N removal was appraised by comparing their error variance using the error models presented in Table 2; the results are presented in Table 7. According to Betiku et al. (2018), the value of R should be greater than 0.8 for an effective correlation between experimental and predicted values. Hence, the high R values (R > 0.95) obtained for the three models indicate their applicability in predicting the experimental values. Adjusted R2 is applied to test the extent of R2 overestimation, and the values obtained for the three models were satisfactorily high, thus validating their suitability for predicting the A–N adsorptive removal. AARE was used to estimate the mean relative error between the model predictions and experimental values. The ANFIS model yielded negligible AARE values, a demonstration of its prediction accuracy and superiority over RSM and ANN. MPSED estimates the geometric error distribution of a system and accounts for the number of degrees of freedom. The MPSED values obtained for RSM, ANN, and ANFIS were 23.3%, 27.1%, and 0.03%, respectively, which demonstrates the high accuracy of the ANFIS model in capturing the nonlinear nature of the adsorptive process. The low error magnitudes obtained for the other statistical indicators (RMSE, SSE, SD, HYBRID) (Table 7) lend further credence to the superior prediction accuracy of the ANFIS model in the present study.

Table 7 Statistical appraisal of RSM, ANN and ANFIS

In general, results obtained from statistical analysis indicate that ANFIS was the most effective model, while ANN was the least effective model in predicting the adsorptive removal of A–N from AWW onto Fe–CS. Thus, the prediction accuracy of the studied models followed the order: ANFIS > RSM > ANN. The results obtained here correlate favorably with the findings of Onu et al. (2020) and Dastorani et al. (2010).
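The appraisal indices discussed above can be computed from paired experimental and predicted values as sketched below. The formulas are the forms commonly used in the adsorption literature and are assumed, not confirmed, to match those of Table 2; `n_params` (the number of model parameters) and the data in the example are hypothetical.

```python
import math


def appraisal_metrics(exp, pred, n_params):
    """Common goodness-of-fit indices for paired experimental/predicted data."""
    n = len(exp)
    mean_exp = sum(exp) / n
    sse = sum((e - p) ** 2 for e, p in zip(exp, pred))
    sst = sum((e - mean_exp) ** 2 for e in exp)
    r2 = 1.0 - sse / sst
    # Adjusted R2 penalizes R2 for the number of fitted parameters.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
    rmse = math.sqrt(sse / n)
    # Average absolute relative error, in percent.
    aare = 100.0 / n * sum(abs((p - e) / e) for e, p in zip(exp, pred))
    # Marquardt's percent standard deviation.
    mpsed = 100.0 * math.sqrt(
        sum(((e - p) / e) ** 2 for e, p in zip(exp, pred)) / (n - n_params))
    # Hybrid fractional error function.
    hybrid = 100.0 / (n - n_params) * sum(
        (e - p) ** 2 / e for e, p in zip(exp, pred))
    return {"R2": r2, "adjR2": adj_r2, "SSE": sse, "RMSE": rmse,
            "AARE": aare, "MPSED": mpsed, "HYBRID": hybrid}


# Hypothetical removal efficiencies (%), for illustration only.
experimental = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 31.0, 39.0]
metrics = appraisal_metrics(experimental, predicted, n_params=1)
for name, value in metrics.items():
    print(name, round(value, 4))
```

Indices that divide by the experimental value (AARE, MPSED, HYBRID) weight errors at low removal efficiencies more heavily than RMSE or SSE do, which is why the ranking of models is usually checked against several indices at once.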

Effect of process variables

Figure 7a and b present the contour plots for the temperature-dosage and temperature-concentration effects, respectively. Both figures illustrate the positive impact of temperature on the removal efficiency and the entire adsorptive process. At all levels of adsorbent dosage and effluent concentration, an increase in the system temperature resulted in a rapid removal rate, as depicted in the 2D contour plots of Fig. 7(a–d). Onu et al. (2021a, b) had earlier reported that a temperature increase augments the adsorbate–adsorbent interaction rate and strengthens the mobility of the adsorbate ions. This phenomenon explains the observed increase in the removal efficiency upon temperature increase, thus indicating an endothermic process. Figure 7(b–e) depicts the 3D surface plots for the pH-dosage and pH-concentration effects, respectively. The figures demonstrate that pH has a prominent quadratic effect on the A–N removal. At any given adsorbent dosage, increasing the pH from 4.0 to 6.8 resulted in a rapid A–N removal rate. However, increasing the pH beyond 6.8 decreased the A–N removal efficiency (see Fig. 7b and e). Under slightly acidic conditions (pH 4.0–6.8), high amounts of H+ compete with the ionized ammonia (NH4+) for the available adsorption sites on Fe–CS, thus resulting in a reduced removal rate. Similarly, from Eq. (34), it is evident that a large amount of NH4+ was converted to NH3.H2O molecules in an alkaline medium. The production of non-ionized NH3.H2O resulted in adsorption difficulties and a reduced removal rate of A–N in the alkaline medium.

$${\text{NH}}_{4}^{ + } + {\text{ OH}}^{-} \quad \to \quad {\text{NH}}_{3} .{\text{H}}_{2} {\text{O}}$$
(34)
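The pH dependence implied by Eq. (34) can be illustrated with the acid–base equilibrium of the ammonium ion. The sketch below uses a pKa of 9.25, the commonly quoted value near 25 °C; this value is an assumption here, since the pKa decreases somewhat at the higher temperatures used in this study.

```python
def ammonium_fraction(ph, pka=9.25):
    """Fraction of total ammonia present as NH4+ at a given pH.

    Follows from the Henderson-Hasselbalch relation for NH4+ <-> NH3 + H+.
    The default pKa (about 25 degrees C) is an assumed value.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Ionized fraction across a range of pH values.
for ph in (5.0, 6.8, 9.0, 11.0):
    print(ph, round(ammonium_fraction(ph), 4))
```

Under this assumption nearly all of the ammonia remains ionized up to about pH 7, and the non-ionized form dominates only above the pKa, which is consistent with the reduced removal reported for the alkaline medium.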

The effect of initial concentration on the removal efficiency was studied by varying the A–N initial concentration from 1.75 to 87.25 mg/L (see supplementary material, Table S1). Figure 7(c–e) shows a continuous reduction in A–N removal efficiency with a progressive increase in initial concentration. This trend is attributed to the increased occupation of the readily available adsorption sites on the surface of Fe–CS. Conversely, the Fe–CS dosage showed the reverse trend (Fig. 7f), as more active sites were made available when the Fe–CS dosage increased from 1.0 to 2.2 g. This increase subsequently enhanced the A–N removal efficiency, as demonstrated in Fig. 7f. Similar observations have been reported by other researchers (Ohale et al. 2020; Aniagor et al. 2018; Onu et al. 2021a, b; Okafor et al. 2015).

Fig. 7 Surface and contour plots for the adsorption of A–N onto Fe–CS

Process optimization

Four optimization techniques (RSM-GA, ANFIS-GA, ANN-GA, and RSM) were applied to optimize the selected input variables (pH, dosage, concentration, temperature, and time) used for modeling the adsorptive process. The optimization aimed at maximizing the A–N removal efficiency. The constraint ranges used for the genetic algorithm optimization are given in Eqs. (35)–(39), while the optimum values for A–N removal efficiency predicted by each method are given in Table 8. The graphical solutions for the RSM-GA, ANFIS-GA and ANN-GA optimization processes are given in Fig. 8(a–c), respectively. Judging by the figures, the removal efficiency increased steadily in a stepwise manner from generation G1–G198 for RSM-GA, G1–G110 for ANFIS-GA, and G1–G182 for ANN-GA, and subsequently remained constant until the termination of the process. Such a plateau suggests that no further crossover or mutation produced a substantive optimization effect on the parameters (Betiku et al. 2018, 2016).

Table 8 Optimization and model validation parameters

Fig. 8 Genetic algorithm optimization for (a) RSM-GA, (b) ANFIS-GA, and (c) ANN-GA

Duplicate validation experiments were conducted at the predicted optimum conditions, and the average A–N removal efficiency is recorded as the actual removal efficiency in Table 8. ANFIS-GA gave the highest A–N removal efficiency prediction of 92.60% (at pH 6.5, 2.2 g, 18.8 mg/L, 317 K, and 156 min). The superiority of the ANFIS-GA prediction over those of RSM-GA and ANN-GA is linked to the accuracy with which the ANFIS model captured the experimental data. Therefore, regarding the quality and accuracy of the optimized process variables, the performance of the optimization techniques followed the trend: ANFIS-GA > RSM-GA > ANN-GA > RSM.

$$5.0 \le {\text{pH}} \le 9.0$$
(35)
$$1.0 \left( {\text{g}} \right) \le {\text{Fe-CS dosage}} \le 2.2 \left( {\text{g}} \right)$$
(36)
$$16 \left( {\frac{{{\text{mg}}}}{{\text{L}}}} \right) \le {\text{A-N concentration}} \le 73\left( {\frac{{{\text{mg}}}}{{\text{L}}}} \right)$$
(37)
$$308\left( K \right) \le {\text{Temperature}} \le 318\left( K \right)$$
(38)
$$60\left( {\min .} \right) \le {\text{time}} \le 180\left( {\min .} \right)$$
(39)
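The bounds in Eqs. (35)–(39) act as box constraints on the genetic algorithm search. The following Python sketch shows one way such a bounded, real-coded GA could be set up. It is not the implementation used in the study: the objective `removal_efficiency` is a hypothetical placeholder standing in for the fitted RSM, ANN, or ANFIS model, and the population size, mutation rate, and operators are illustrative assumptions.

```python
import random

# Box constraints from Eqs. (35)-(39): (lower, upper) for each input variable.
BOUNDS = {
    "pH": (5.0, 9.0),
    "dosage_g": (1.0, 2.2),
    "conc_mg_L": (16.0, 73.0),
    "temp_K": (308.0, 318.0),
    "time_min": (60.0, 180.0),
}
KEYS = list(BOUNDS)


def removal_efficiency(x):
    """Hypothetical placeholder objective (NOT the fitted model from the study).

    It peaks at the midpoint of every bound, purely so that the sketch has
    something to maximize.
    """
    score = 100.0
    for key, value in zip(KEYS, x):
        lo, hi = BOUNDS[key]
        score -= ((value - (lo + hi) / 2) / (hi - lo)) ** 2
    return score


def clip(x):
    """Force every variable back inside its bounds."""
    return [min(max(v, BOUNDS[k][0]), BOUNDS[k][1]) for k, v in zip(KEYS, x)]


def genetic_algorithm(objective, generations=200, pop_size=40,
                      mutation_rate=0.2, seed=0):
    """Maximize `objective` within BOUNDS; return (best point, best-per-generation)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(*BOUNDS[k]) for k in KEYS] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=objective, reverse=True)
        best_history.append(objective(pop[0]))
        parents = pop[: pop_size // 2]      # truncation selection
        children = [pop[0][:]]              # elitism: carry the best point forward
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            child = [w * u + (1 - w) * v for u, v in zip(a, b)]  # arithmetic crossover
            for i, k in enumerate(KEYS):
                if rng.random() < mutation_rate:                  # Gaussian mutation
                    lo, hi = BOUNDS[k]
                    child[i] += rng.gauss(0.0, 0.1 * (hi - lo))
            children.append(clip(child))
        pop = children
    return max(pop, key=objective), best_history
```

Because the best individual is carried forward unchanged, the best-per-generation curve returned by this sketch can only rise or stay flat, which is the same stepwise-then-constant shape described for Fig. 8.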

Mechanistic modeling

The mechanistic interpretation of adsorption kinetic data is an important step in describing the sorption process, and accurate identification of the predominant sorption mechanism is paramount for design purposes (Ohale et al. 2020). Generally, the solute transfer mechanism in an adsorption system is characterized by boundary layer (film) diffusion, intraparticle (pore) diffusion, or both. Meanwhile, the final adsorption stage is mostly regarded as the equilibrium step, provided the adsorptive process was sustained to the termination point (Onu et al. 2021b). The overall mechanism is usually controlled by the slowest step of the adsorption process. Therefore, the effect of the final step (equilibrium stage), which is assumed to be rapid, is considered negligible (Aniagor et al. 2018). The data used in kinetic modeling were obtained by studying the variation of adsorption capacity (\(q_{t}\)) with time at different A–N concentrations (see supplementary material, Fig. S2). The mechanistic models applied in the study (Table 3) are discussed individually in the following subsections.

Double exponential model (DEM)

The double exponential model (DEM) describes the mechanism of a sorption process as two-step kinetics. The first phase entails a rapid uptake of adsorbate involving external and internal diffusion. Afterward, a slow step controlled by intraparticle diffusion dominates the sorption mechanism, and finally, the process attains equilibrium. The DEM plot and extracted mechanistic parameters are given in Fig. 9a and Table 9, respectively. From the obtained parameters, the overall kinetic constants (\(K_{D1}\) and \(K_{D2}\)) for the rapid and slow steps were nearly identical. This indicates that both film and intraparticle diffusion influenced the adsorption of A–N from AWW using Fe–CS.
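As an illustration of the two-step picture, the Python sketch below evaluates a double exponential uptake curve. The expression used in the study is listed in Table 3 and is not reproduced here, so the sketch assumes the commonly cited form \(q_{t} = q_{e} - (D_{1}/m)e^{-K_{D1}t} - (D_{2}/m)e^{-K_{D2}t}\), where \(m\) is the adsorbent concentration. The function and argument names are hypothetical.

```python
import math


def dem_uptake(t, q_e, d1, d2, k_d1, k_d2, m_ads):
    """Adsorption capacity q_t from an assumed double exponential model.

    t      : contact time
    q_e    : equilibrium adsorption capacity
    d1, d2 : amplitudes of the rapid and slow steps
    k_d1   : rate constant of the rapid step
    k_d2   : rate constant of the slow step
    m_ads  : adsorbent concentration in solution
    """
    rapid = (d1 / m_ads) * math.exp(-k_d1 * t)
    slow = (d2 / m_ads) * math.exp(-k_d2 * t)
    return q_e - rapid - slow
```

When the two rate constants are equal, the two terms collapse into a single exponential with amplitude \((D_{1} + D_{2})/m\), so near-identical constants mean the rapid and slow steps cannot be cleanly separated by this model.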

Fig. 9 Mechanistic model plots for (a) double exponential model, (b) liquid film diffusion model, (c) homogeneous solid diffusion model, (d) Boyd (Reichenberg) model, and (e) Weber–Morris model

Table 9 Mechanistic parameters of studied models at varying concentrations of A–N

Weber–Morris intraparticle diffusion model

The Weber–Morris plot (\(q_{t}\) vs. \(t^{1/2}\)) is presented in Fig. 9e, while the generated mechanistic parameters are presented in Table 9. The results showed that three distinct regions were involved in the sorption mechanism. The first linear region, attributed to film diffusion, was recorded within the sorption period of 0–90 min. This period was characterized by bulk diffusion of the A–N ions onto the external surface of Fe–CS. The second stage, attributed to intraparticle diffusion, was recorded within 91–178 min and was dominated by the distribution of the A–N ions into the macropores, mesopores, and micropores of Fe–CS. The third stage (179–300 min) represents the equilibrium period. The high \(R^{2}\) values (0.92–0.99) obtained for the model denote a significant influence of both film and pore diffusion in the adsorption process. Similarly, the Weber–Morris adsorption capacity (\(q_{ipd}\)) was observed to increase with increasing A–N concentration, an indication of increased film diffusion resistance. The improved adsorption capacity is attributed to the high A–N concentration, which provided a favorable driving force for the external mass transfer process (Dotto and Pinto 2012). Meanwhile, none of the trend lines passed through the origin (Fig. 9e), thus confirming that both film and intraparticle diffusion contributed significantly to the adsorptive mechanism.

Liquid film diffusion model (LFDM) and Homogeneous solid diffusion model (HSDM)

To estimate the film diffusion coefficient (\(k_{f}\)) and the intraparticle diffusion coefficient (\(D_{s}\)), the experimental data of the first region of the Weber–Morris plot were fitted with the liquid film diffusion model, LFDM (Eq. 28), while the data of the second region were fitted with the homogeneous solid diffusion model, HSDM (Eq. 29). The model plots are presented in Fig. 9b and c for LFDM and HSDM, respectively, while the estimated mass transfer coefficients (\(k_{f}\) and \(D_{s}\)) and the associated \(R^{2}\) values are presented in Table 9. Both LFDM and HSDM fitted the experimental data well (\(R^{2} > 0.96\)), illustrating the significance of both film and intraparticle diffusion in the adsorptive mechanism. Furthermore, the \(k_{f}\) values decreased with increasing A–N concentration, which corroborates the earlier result that a higher A–N concentration lowered the sorption rate and raised the film diffusion resistance. Conversely, the intraparticle diffusion coefficient (\(D_{s}\)) from the HSDM increased with increasing A–N concentration (Fig. 9c). This indicates a reduction in the effect of the intraparticle diffusion mechanism as the A–N concentration increased. Similar observations have been reported by other researchers (Aniagor and Menkiti 2018; Dotto and Pinto 2012).

Boyd (Reichenberg) model

The contributions of film and intraparticle diffusion have been established in the previous discussion. However, none of the discussed models confirmed the actual rate-controlling step involved in the adsorption of A–N onto Fe–CS. To determine the actual rate-controlling step, the experimental data were further analyzed with the Boyd kinetic model, as done by other researchers (Aniagor and Menkiti 2018; Dotto and Pinto 2012; Tavlieva et al. 2013). The Boyd model is given in Eqs. (30) and (31), and the model parameters obtained from the plot of \(B_{t}\) vs. \(t\) are presented in Table 9. The linearity of the Boyd plot (Fig. 9d) was used to determine the rate-controlling mechanism. The plot of \(B_{t}\) vs. \(t\) at 15 mg/L A–N concentration produced a straight line passing through the origin. Meanwhile, higher A–N concentrations (30–75 mg/L) produced straight lines that did not pass through the origin. According to Tavlieva et al. (2013), if the plot of \(B_{t}\) vs. \(t\) produces a straight line passing through the origin, it illustrates an intraparticle diffusion mechanism; otherwise, film or external diffusion dominates. Hence, at a low A–N concentration (≤ 15 mg/L), intraparticle diffusion dominated the adsorption process, while film diffusion mainly controlled the sorption process at higher concentrations (> 15 mg/L).

Thermodynamics

Thermodynamic studies were used to demonstrate the effect of change in temperature on the adsorption system (Ohale et al. 2020). To illustrate this relationship, thermodynamic parameters such as a change in enthalpy (\(\Delta H^{0}\)), change in Gibbs free energy (\(\Delta G^{0}\)), change in entropy (\(\Delta S^{0}\)), and activation energy (\(E_{A}\)) were calculated using Eqs. (40)–(43).

$$\ln (K_{c} ) = \frac{{\Delta S^{0} }}{R} - \frac{{\Delta H^{0} }}{R} \frac{1}{T}$$
(40)
$$K_{c} = \frac{{q_{e} }}{{C_{e} }}$$
(41)
$$\ln (K_{c} ) = \ln \left( A \right) - \frac{{E_{A} }}{{{\text{RT}}}}$$
(42)
$$\Delta G^{0} = \Delta H^{0} - T\Delta S^{0}$$
(43)

From the plot of \(\ln (K_{c} )\) vs. \(\frac{1}{T}\), \(\Delta H^{0}\) and \(\Delta S^{0}\) were obtained from the slope and intercept, respectively. Calculated values of \(\Delta H^{0}\), \(\Delta G^{0}\), \(\Delta S^{0}\) and \(E_{A}\) are presented in Table 10. The positive \(\Delta H^{0}\) value showed that the adsorption process was endothermic. Also, the negative \(\Delta G^{0}\) values indicate that the adsorption of A–N onto Fe–CS was spontaneous at all temperature levels. The reduction in \(\Delta G^{0}\) values with a corresponding increase in temperature indicates that adsorption became more favorable at higher temperatures (Hashem et al. 2021b).

Table 10 Thermodynamic parameters

The physisorption nature of the system was confirmed by the values of activation energy and enthalpy. According to Ohale et al. (2020) and Onu et al. (2021b), a physisorption process dominates if \(E_{A}\) lies in the range 0 < \(E_{A}\) < 40 \({\text{kJ}}/{\text{mol}}\) or if \(\Delta H^{0} < 80\ {\text{kJ}}/{\text{mol}}\). The values of \(E_{A}\) and \(\Delta H^{0}\) obtained in this work were 3.7454 \({\text{kJ}}/{\text{mol}}\) and 4.1150 \({\text{kJ}}/{\text{mol}}\), respectively, which corroborate a physisorption process. The positive entropy value of 22.9710 \({\text{J}}/({\text{mol}}\,{\text{K}})\) indicates minor randomness around the surface of Fe–CS.

Limitations and recommendations for future studies

This study has modeled and optimized the adsorption of ammonia–nitrogen from abattoir wastewater. With respect to the accurate design and fabrication of an adsorption tower, the results presented in this work are limited to the use of iron-functionalized crab shell as an efficient adsorbent for high-performance A–N removal. For a more comprehensive treatment of AWW, additional developments in this research area are recommended as follows:

Taking into consideration the complex nature of AWW, a more robust optimization route such as multi-objective optimization is needed to include not just A–N, but also other pollution control indices that are important for waste reduction in the slaughterhouse industry. The GA optimization employed in this study considerably enhanced the efficiency of A–N removal; however, additional techniques such as particle swarm optimization, support vector machines, etc., are recommended.

Conclusion

The present study investigated the predictive accuracy of RSM, ANN, and ANFIS in modeling the adsorptive removal of A–N from AWW using novel Fe–CS prepared from CS. The characterization results established that the properties of CS were improved after chemical and thermal activation. The post-adsorption characterization demonstrated that Fe–CS was very effective in the adsorptive uptake of A–N. Experimental design illustrated the applicability of RSM, ANN and ANFIS in predictive modeling of A–N uptake from AWW. Model comparative analysis using statistical indices showed that the predictive accuracy of the studied models followed the order: ANFIS > RSM > ANN. Process optimization gave optimum values of 92%, 91.58%, 92.6%, and 91.8% for RSM–GA, ANN–GA, ANFIS–GA and RSM, respectively. Results obtained from mechanistic modeling revealed that intraparticle diffusion dominated the adsorption process at a low concentration of A–N, while film diffusion mainly controlled the sorption process at A–N concentrations higher than 15 mg/L. Thermodynamic parameters indicated that the process was spontaneous, physical, and endothermic.