Introduction

In Egypt, industrial wastewater is considered the main source of pollution that leads to serious environmental problems (Nasr et al. 2015). Unfortunately, more than 350 factories discharge their industrial wastewater into the Nile directly or without appropriate treatment. In Alexandria, there are about 1250 treatment plants that discharge industrial wastewater into the sea via Lake Marriott (Abd El-Salam and El-Naggar 2010). Those wastes are loaded with heavy metals, such as cadmium, chromium, copper, nickel, arsenic, lead and zinc, which are among the most hazardous pollutants released by chemical-intensive industries. These heavy metals pose serious health hazards, including cancer, organ damage, disorders of the nervous system and, in extreme cases, death (Sud et al. 2008). As a result, about 60 % of the treatment plants contribute to marine pollution of the Mediterranean coast of Alexandria. Accordingly, rules for controlling the disposal of industrial wastewater containing heavy metals have been tightened. These regulations include Law 4/94 (discharge to coastal environment), Law 93/62 as modified by Decree 44/2000 (discharge to sewer system) and Law 48/82 (discharge into the Nile, its branches/canals and drains) (Nasr and Zahran 2015).

Among heavy metals, Cd2+ has been recognized as one of the most toxic, teratogenic and carcinogenic species (Cui et al. 2008). The major sources of Cd2+ released into the environment are phosphate fertilizers, waste batteries, the electroplating industry, paint pigments, smelting and alloy manufacturing (Garg 2008). Once accumulated in the body, Cd2+ is stored mainly in the bone, liver and kidneys, causing renal damage, pulmonary insufficiency and negative effects on blood. Infants and children who drink water containing large amounts of Cd2+ may suffer stomach irritation, vomiting and diarrhea (Singh et al. 2005). Accordingly, the drinking water guideline value recommended by the World Health Organization and Egyptian standards for Cd2+ is 0.005 mg/L (Wang et al. 2013). Thus, it is necessary to treat Cd2+-contaminated wastewater prior to its discharge to the environment. Removal of Cd2+ can be achieved by conventional treatment processes such as chemical precipitation, ion exchange, electrochemical removal and reverse osmosis. However, these processes have disadvantages such as high cost, incomplete removal, high energy requirements and production of toxic sludge (Sud et al. 2008).

Among the novel approaches developed for Cd2+ removal is biosorption (Rao and Kashifuddin 2014). Biosorption is a process that utilizes inexpensive biomass to sequester toxic heavy metals from water (Mishra 2014). The term biosorbent refers to material derived from microbial biomass, seaweed or plants that exhibits adsorptive properties, so that materials of natural origin (such as agricultural wastes) are generally used in biosorption studies (Kaur et al. 2013). In particular, biosorption is convenient for the elimination of contaminants from industrial effluents. This process is considered a safe, cost-effective and eco-friendly technology for Cd2+ removal. Owing to their relative abundance and low commercial cost, natural biomaterials (especially crop wastes) are promising biosorbents (Hussein 2013). Among these crops, rice is one of the most commonly grown and consumed grains in Egypt. Rice straw is produced in large quantities as a waste, and three million tons of it are burned directly in the field, generating several environmental problems. Moreover, rice straw contains functional groups suitable for effective adsorption (Gao 2008). Accordingly, the abundance and availability of rice straw make it a suitable natural biosorbent.

The major factors affecting biosorption processes are biomass dose, hydrogen ion concentration (pH) and metal concentration (Garg 2008). Most previous studies tested the effect of one factor at a time; few examined three parameters and their interactions in the biosorption process. Such interactions can be investigated using several techniques, such as the factorial experimental design. This design is employed to identify the most important factors affecting the metal removal efficiency, as well as how the effect of one factor depends on the others. Additionally, artificial intelligence (AI) techniques, e.g., artificial neural networks (Nasr et al. 2012) and fuzzy logic (Nasr et al. 2014a), can be efficiently implemented to predict metal removal from input factors. Moreover, AI is capable of depicting the interactive influence between variables, as well as their correlation with the simulation output. AI methods can identify complex and highly nonlinear relationships between several parameters and variables. Hence, modeling based on AI needs little knowledge about the process; i.e., it uses only sets of input/output data. Accordingly, these mathematical methods are helpful as they can reduce time and overall research cost (Krishna and Sree 2014).

In this work, rice straw was investigated for the removal of Cd2+ ions from aqueous solution. The effects of biomass dose, hydrogen ion concentration (pH) and metal levels on the removal of Cd2+ ions were studied. The resulting experimental data were evaluated and fitted using a \(2^{3}\) factorial experimental design and AI models.

Materials and methods

Preparation of the biosorbent

Rice straw (Oryza sativa) was used as biosorbent for removal of Cd2+ from aqueous solution. The biosorbent was collected from agro-ecosystems of Kafr El-Dawar, situated in Northern Egypt. Plant samples were washed with double distilled water to remove dust or foreign particles, and then dried at 60 °C for 24 h. To obtain homogeneous powders, samples were finely ground using a stainless steel grinder, and then graded by a stainless steel sieve with a mesh size smaller than 0.5 mm. The biosorbent was characterized by Fourier transform infrared (FT-IR) spectroscopy (FT-IR Bruker, TENSOR 37) operated in the range 4000–400 cm−1.

Preparation of stock solution

The Cd2+ ion solution was prepared from analytical grade stock standard of concentration 1000 mg/L. The initial concentrations of Cd2+ were obtained by dilution of the stock solution using double distilled water. Exact desired concentrations of the prepared solutions were determined by inductively coupled plasma (ICP-AES). Values of pH were adjusted by using 0.1 M HNO3 and/or 0.1 M NaOH.

Experimental setup

Eight batch experiments were conducted in duplicate at constant room temperature (25 ± 4 °C), biosorbent particle size (<0.5 mm), contact time (2 h) and agitation rate (5 Hz). Three factors were manipulated at discrete “low” and “high” values as follows: biosorbent dose, in g/L (0.1, 0.5); pH (2, 7); and initial Cd2+ concentration, in mg/L (10, 100). Biosorbent dose and Cd2+ concentration were chosen based on a literature survey (Cui et al. 2008; Rocha et al. 2009; Ding et al. 2012). The pH values were selected at 2 to simulate heavily polluted wastewater and at 7 to represent wastewater under normal environmental conditions. The influence of those factors on the removal of Cd2+ ions was evaluated and optimized by a \(2^{3}\) full factorial experimental design. The design matrix and the interactions of the three studied factors, coded as −1 (low level) and +1 (high level), are listed in Table 1.
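The response tracked in these runs is the Cd2+ removal efficiency. The text does not state its formula; assuming the conventional definition based on the initial and residual concentrations (the symbols \(E\), \(C_{0}\) and \(C_{e}\) are introduced here for illustration only), it can be written as:

$$E\,(\% ) = \frac{{C_{0} - C_{e} }}{{C_{0} }} \times 100,$$

where \(C_{0}\) and \(C_{e}\) are the initial and final Cd2+ concentrations (mg/L). As a purely hypothetical example, a run starting at 10 mg/L and ending at 4 mg/L would correspond to \(E = (10 - 4)/10 \times 100 = 60\,\%\) under this assumed definition.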

Table 1 A \(2^{3}\) two-level, full factorial design showing runs in standard order
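As an illustration of how such a design matrix can be generated, the following minimal Python sketch enumerates the eight coded runs in standard order (first factor varying fastest), adds the two-factor interaction columns and decodes each run to the factor levels given above. The function and variable names are hypothetical and are not taken from the original study.

```python
from itertools import product

# Factor names and (low, high) levels as stated in the experimental setup.
FACTORS = ("BD", "pH", "X")
LEVELS = {"BD": (0.1, 0.5), "pH": (2, 7), "X": (10, 100)}


def full_factorial(factors):
    """Return coded runs (-1/+1) of a two-level full factorial in standard order."""
    runs = []
    # product() varies its last element fastest, so each tuple is reversed
    # to make the first factor the fastest-changing one (standard order).
    for levels in product((-1, 1), repeat=len(factors)):
        runs.append(dict(zip(factors, reversed(levels))))
    return runs


def decode(run):
    """Map coded levels back to the actual low/high factor values."""
    return {name: LEVELS[name][0 if code < 0 else 1] for name, code in run.items()}


design = full_factorial(FACTORS)
for number, run in enumerate(design, start=1):
    interactions = {
        "BD*pH": run["BD"] * run["pH"],
        "BD*X": run["BD"] * run["X"],
        "pH*X": run["pH"] * run["X"],
    }
    print(number, run, interactions, decode(run))
```

Each interaction column is simply the element-wise product of the corresponding coded factor columns, which is why a full factorial design can estimate interactions without additional runs.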

Artificial neural network application

A neural network is an interconnected assembly of units or nodes known as artificial neurons, which simulates the function of a human nervous system. In the current study, a one-layer network with three input elements and six neurons was configured. Each element of the 3-length input vector (\(P_{3\; \times \;1}\)) is connected to each neuron input through a 6 × 3 weight matrix (\(W_{6\; \times \;3}\)). The inputs are weighted and summed (\(\sum {W_{6\; \times \;3} } P_{3\; \times \;1}\)), and then a 6-length bias (\(b_{6\; \times \;1}\)) is added. The resulting net input (\(u_{6\; \times \;1} = \sum {W_{6\; \times \;3} } \;P_{3\; \times \;1} \; + \;b_{6\; \times \;1}\)) is transformed in a linear or nonlinear manner through transfer functions. The hyperbolic tangent transfer function (Eq. 1) squashes the output into the range −1 to 1 as the neuron’s net input goes from negative to positive infinity. This transfer function is used for pattern recognition problems (in which a decision is being made by the network). However, the linear transfer function (Eq. 2) can be efficiently used in the last layer for function fitting:

$$f\left( x \right) = \frac{{e^{x} \; - \;e^{ - x} }}{{e^{x} \; + \;e^{ - x} }}, - 1\; \le \,f\left( x \right)\; \le \,1$$
(1)
$$f\left( x \right)\; = \;x, - \infty \; < \;f\left( x \right)\; < \; + \infty .$$
(2)

The input layer receives data from an experimental source and transfers them to the network for handling. A hidden layer receives information from the input layer and passes processed information to an output layer. The output layer receives all data from the network and sends the predicted results to an external receiver. The output is obtained by applying one of the transfer functions to the net input. The target is compared with the output by calculating the mean square error (MSE) value (Nasr et al. 2012). The error is propagated back from the output layer to the input layer, so that the values of the weights and biases are tuned accordingly until a stopping criterion, such as the maximum number of iterations, is met. The MSE is calculated from Eq. 3:

$$MSE = \frac{{\sum\nolimits_{i = 1}^{N} {\left( {t_{i} - a_{i} } \right)^{2} } }}{N},$$
(3)

where \(t_{i}\) and \(a_{i}\) are target and predicted outputs, respectively, and N is the number of points.
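Equations 1 to 3 can be written out directly in code. The following minimal Python sketch implements the two transfer functions and the MSE exactly as defined above; the function names are illustrative and do not correspond to any toolbox API.

```python
import math


def tanh_transfer(x):
    """Hyperbolic tangent transfer function (Eq. 1); output lies in [-1, 1]."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def linear_transfer(x):
    """Linear transfer function (Eq. 2); output is unbounded."""
    return x


def mse(targets, outputs):
    """Mean square error (Eq. 3) between targets t_i and predictions a_i."""
    if len(targets) != len(outputs) or not targets:
        raise ValueError("targets and outputs must be non-empty and equal in length")
    return sum((t - a) ** 2 for t, a in zip(targets, outputs)) / len(targets)
```

For example, `mse([1, 2, 3], [1, 2, 5])` averages the squared errors 0, 0 and 4 over N = 3 points.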

Adaptive neuro-fuzzy inference system application

Adaptive neuro-fuzzy inference system (ANFIS) is a successful tool that combines artificial neural networks and fuzzy logic into an integrated approach. The integrated system has the advantages of both neural networks (i.e., learning, adapting and optimization) and fuzzy systems such as human-like “if–then” rule reasoning, and the readiness to incorporate expert knowledge. For a first-order Takagi–Sugeno fuzzy model, the ANFIS composed of two inputs x and y has two fuzzy “if–then” rules as the following:

Rule-1: If x is \(A_{1}\) and y is \(B_{1}\), then \(f_{1} = p_{1} x + q_{1} y + r_{1}\).

Rule-2: If x is \(A_{2}\) and y is \(B_{2}\), then \(f_{2} = p_{2} x + q_{2} y + r_{2}\).

As shown in Fig. 1, the system has a total of five layers, where the functioning of each layer is described as follows:

Fig. 1 Typical first-order Sugeno ANFIS architecture

Layer-1 (Input node): Every node in this layer generates a membership grade of a linguistic label. Parameters in this layer are referred to as “premise parameters”. The membership functions of \(A_{i}\) (Eq. 4) and \(B_{i - 2}\) (Eq. 5) are:

$$O_{i}^{1} = \mu A_{i} \left( x \right) = \frac{1}{{1 + \left| {\frac{{x - c_{i} }}{{a_{i} }}} \right|^{{2b_{i} }} }}\quad {\text{for}}\;i = 1,\;2,\;{\text{or}}$$
(4)

$$O_{i}^{1} = \mu B_{i - 2} \left( y \right)\quad {\text{for}}\;i = 3,\;4,$$
(5)

where x (or y) is the input to node i, and \(A_{i}\) (or \(B_{i - 2}\)) is the linguistic label (small, large, etc.) related to this node; and \(a_{i}\), \(b_{i}\) and \(c_{i}\) are the parameter set that governs the shape of the membership function.

Layer-2 (Rule nodes): In this layer, the AND/OR operator is applied to get one output that represents the antecedent of the fuzzy “if–then” rule. The output of every node represents a firing strength, which each node computes by multiplying all the incoming signals (Eq. 6):

$$O_{i}^{2} = w_{i} = \mu A_{i} \left( x \right) \times \mu B_{i} \left( y \right),\quad i = 1,\;2.$$
(6)

Layer-3 (Average nodes): The ith node of this layer calculates the ratio of the ith rule’s firing strength to the sum of all rules’ firing strengths. The outputs of this layer are called “normalized firing strengths” (Eq. 7):

$$O_{i}^{3} = \overline{{w_{i} }} = \frac{{w_{i} }}{{w_{1} + w_{2} }},\quad i = 1,\;2.$$
(7)

Layer-4 (Consequent nodes): Every node i in this layer is an adaptive node with a node function (Eq. 8). Parameters in this layer are known as “consequent parameters”.

$$O_{i}^{4} = \overline{{w_{i} }} f_{i} = \overline{{w_{i} }} \left( {p_{i} x + q_{i} y + r_{i} } \right),$$
(8)

where \(\overline{{w_{i} }}\) is the output of layer-3, and \(\{ p_{i} ,q_{i} ,r_{i} \}\) is the parameter set of this node.

Layer-5 (Output node): The single node computes the overall output as the summation of all incoming signals as expressed by Eq. 9.

$$O^{5} = \sum\limits_{i} {\overline{{w_{i} }} f_{i} } = \frac{{\sum\nolimits_{i} {w_{i} f_{i} } }}{{\sum\nolimits_{i} {w_{i} } }}.$$
(9)
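The five layers described above can be traced in a few lines of code. The following Python sketch is a minimal forward pass of the two-input, two-rule first-order Sugeno system of Eqs. 4 to 9; it is not the fitted model of this study, and all parameter values in the usage example are hypothetical.

```python
def bell(x, a, b, c):
    """Generalized bell membership function of Eq. 4 with parameters a, b, c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def anfis_forward(x, y, premise_a, premise_b, consequents):
    """Forward pass of a two-input first-order Sugeno ANFIS.

    premise_a, premise_b: lists of (a, b, c) tuples for labels A_i and B_i.
    consequents: list of (p, q, r) tuples, one per rule.
    """
    # Layer 1: membership grades of x in A_i and of y in B_i (Eqs. 4 and 5).
    mu_a = [bell(x, *params) for params in premise_a]
    mu_b = [bell(y, *params) for params in premise_b]
    # Layer 2: firing strength of each rule as a product (Eq. 6).
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths (Eq. 7).
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order consequents (Eq. 8).
    layer4 = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: overall output as the sum of incoming signals (Eq. 9).
    return sum(layer4)


# Hypothetical premise and consequent parameters, chosen only for illustration.
example_output = anfis_forward(
    1.0, 1.0,
    premise_a=[(1.0, 2, 0.0), (1.0, 2, 2.0)],
    premise_b=[(1.0, 2, 0.0), (1.0, 2, 2.0)],
    consequents=[(1.0, 1.0, 0.0), (2.0, 0.0, 1.0)],
)
```

In this example both rules fire equally (each membership grade is 0.5), so the output is the plain average of the two rule consequents, \(f_{1} = 2\) and \(f_{2} = 3\).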

Results and discussion

Biosorbent characterization

The FT-IR spectra of the rice straw before and after Cd2+ biosorption are shown in Fig. 2. The predominant broad and strong peak at 3436.89 cm−1 can be attributed to the presence of hydroxyl groups (–OH) in alcohol and/or phenol. The less prominent band at 2920 cm−1 is due to stretching vibrations of the –CH3 group of aliphatic acids. Moreover, the band that appeared at 1638.18 cm−1 represents C=O stretching of the carbonyl group. Also, the peak displayed at 1321.58 cm−1 is possibly due to carboxylate group (–COO) stretching. After Cd2+ biosorption, shifts in the positions and intensities of the FT-IR bands were observed. This indicates that the functional groups present on the rice straw were involved in the interaction with Cd2+ ions, and implies that the biosorbent’s functional groups and the metal ions may have undergone a chemical reaction.

Fig. 2 FT-IR spectra of rice straw (R) before and after Cd2+ biosorption

Biosorption results and factorial design application

As displayed in Fig. 3, a variation in initial Cd2+ ion concentration (X) from high to low level resulted in a 31.00 % increase in the removal efficiency. In contrast, a variation in pH and biomass dose (BD) from low to high level resulted in 28.60 and 23.61 % increases in the Cd2+ ion removal efficiency, respectively. The results indicated that metal ion concentration has an inverse effect on Cd2+ biosorption. This is because, at lower concentrations, the ratio of active adsorption sites to initial metal ions is larger, resulting in higher removal efficiency (Ding et al. 2012). However, with increasing metal ion concentration, the functional groups on the biomass surface could become saturated, leaving few available active sites (Kaur et al. 2013). As a result, at higher metal concentrations, the metal ions would compete for the available binding sites. On the other hand, pH has a direct effect on Cd2+ biosorption. The enhancement of metal removal with increasing pH can be explained by a decrease in competition between protons and the metal cations for the same functional groups. Additionally, the decrease in positive charge of the adsorbent results in a lower electrostatic repulsion between the metal cations and the surface (Adekola et al. 2014). Arief et al. (2008) explained this finding by the fact that when the concentration of H+ ions is high, Cd2+ ions must compete with H+ ions in order to attach to the surface functional groups of the agricultural wastes. They also found that when the pH rises, fewer H+ ions exist and, consequently, Cd2+ ions have a better chance to bind at free binding sites. Additionally, at an acidic pH of 2, the cellulose, hemicellulose and lignin of the rice straw (adsorbent) might be loosened and converted to glucose, which would contribute to a decline in the adsorption efficiency.
Similarly, BD has a direct impact on Cd2+ biosorption because the number of binding sites available for adsorption is determined by the BD in the aqueous solution. At a low biosorbent dose (limited number of active sites), the biosorbent becomes saturated above a certain metal concentration (Alalm et al. 2015). However, an increase in the BD generally elevates the amount of solute biosorbed, owing to the larger biosorbent surface area, which in turn increases the number of binding sites.

Fig. 3 Main effects plot for Cd2+ removal by rice straw

A \(2^{3}\) full factorial design was considered, with three factors (BD, pH and X), each at two levels (“−1” and “+1”). The design allows studying the effect of each factor, as well as the effects of interactions between them, on the response variable (Saadat and Karimi-Jashni 2011). The design tests three main effects (BD, pH and X) and three two-factor interaction effects (BD × pH, BD × X and pH × X). From the Prob > F column in Table 2, the main effects of BD, pH and X achieved p values of 0.2248, 0.1881 and 0.1742, respectively. Thus, ranked by ascending p value, the influences of the environmental factors on Cd2+ biosorption follow the order X > pH > BD. However, the interactions BD × pH, BD × X and pH × X resulted in p values of 0.8539, 0.7656 and 0.8255, respectively. These high p values indicate that there was no statistically significant two-factor interaction affecting Cd2+ biosorption.
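To make the notion of main and interaction effects concrete, the following Python sketch estimates them as the difference between the mean response at the high level and the mean response at the low level of each coded column. The response values are synthetic, generated from a made-up additive model purely for illustration; they are not the measured removal efficiencies of this study, and all names are hypothetical.

```python
from itertools import product


def effect(column, response):
    """Effect of a coded (-1/+1) column: mean response at +1 minus mean at -1."""
    high = [y for sign, y in zip(column, response) if sign > 0]
    low = [y for sign, y in zip(column, response) if sign < 0]
    return sum(high) / len(high) - sum(low) / len(low)


# Eight coded runs (BD, pH, X) in standard order, BD varying fastest.
runs = [(bd, ph, x) for x, ph, bd in product((-1, 1), repeat=3)]

# SYNTHETIC responses from an invented additive model (no interactions).
response = [50 + 10 * bd + 5 * ph - 15 * x for bd, ph, x in runs]

bd_col = [run[0] for run in runs]
ph_col = [run[1] for run in runs]
x_col = [run[2] for run in runs]

effects = {
    "BD": effect(bd_col, response),
    "pH": effect(ph_col, response),
    "X": effect(x_col, response),
    "BD*pH": effect([b * p for b, p in zip(bd_col, ph_col)], response),
    "BD*X": effect([b * x for b, x in zip(bd_col, x_col)], response),
    "pH*X": effect([p * x for p, x in zip(ph_col, x_col)], response),
}
print(effects)
```

Because the synthetic model has no cross terms, all interaction effects come out as zero, while each main effect equals twice the corresponding model coefficient. An analysis of variance, as reported in Table 2, would additionally attach a p value to each effect.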

Table 2 Factorial design results of Cd2+ biosorption by rice straw using three factors (BD, pH and X), and each factor has two levels (“−1” and “+1”)

Those results were comparable to previous studies. For example, Ding et al. (2012) investigated the removal of Cd2+ from large-scale effluent contaminated by heavy metals. The study found that rice straw, as a biosorbent, exhibited a short biosorption equilibrium time of 5 min, high biosorption capacity (13.9 mg/g) and high removal efficiency at a pH range of 2.0–6.0. Moreover, Rocha et al. (2009) carried out experiments using waste rice straw as a biosorbent to adsorb Cd(II) ions from aqueous solutions at room temperature. They found that the adsorption process was quick, reaching equilibrium within 1.5 h, with maximum efficiencies at pH 5.0. Additionally, Muhamad et al. (2010) investigated the effect of pH and temperature on Cd2+ removal by wheat straw. The results revealed that, by increasing the temperature from 20 to 40 °C at an initial concentration of 100 mg/L, the Cd2+ uptake increased from 12.2 to 15.7 mg/g. Moreover, by increasing the solution pH from 3.0 to 7.0, the adsorption capacity increased from 2.7 to 14.4 mg/g.

Artificial neural network application

As displayed in Fig. 4, an ANN with a 3–6–1 structure was generated to predict the removal efficiency of Cd2+ ions from aqueous solution using three inputs: BD, pH and metal ion concentration (X). A total of 43 experimental sets were processed to cover a wide range of inputs for which the network was used. The input matrix consists of 43 column vectors, each holding the 3 input variables, and the target matrix consists of the corresponding 43 removal efficiencies. The network used the Levenberg–Marquardt method (trainlm) for training, which is applicable for small and medium-size networks. Input and target vectors were randomly divided into three sets. The first 60 % are used for training, where the gradient is computed while updating weights and biases. The second 20 % are used for validation to stop training before overfitting, while the last 20 % are used as a completely independent test of network generalization (Nasr et al. 2014b). The actual number of hidden neurons was estimated by trial and error.
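The random 60/20/20 division can be sketched as follows. This is a minimal Python illustration, not the toolbox routine used in the study; the rounding rule and the seed are assumptions, so the exact set sizes produced by MATLAB may differ slightly.

```python
import random


def split_indices(n, train=0.6, val=0.2, seed=0):
    """Randomly divide n sample indices into training, validation and test sets.

    The rounding rule is an assumption made for this sketch; the remainder
    after the training and validation sets goes to the test set.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(train * n)
    n_val = round(val * n)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )


train_idx, val_idx, test_idx = split_indices(43)
print(len(train_idx), len(val_idx), len(test_idx))
```

With 43 samples, this rounding gives 26 training, 9 validation and 8 test samples, and every sample is assigned to exactly one set.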

Fig. 4 Neural network structure for Levenberg–Marquardt algorithm composed of three inputs, one hidden layer with six neurons and one output layer (notation: 3–6–1)

The plot in Fig. 5 shows the progress of training variables, such as the magnitude of the gradient of performance and the number of validation checks. The training terminates if the magnitude of the gradient is less than 1e-5, or if the number of validation checks (i.e., the number of successive iterations in which the validation performance fails to decrease) reaches 6 (Nasr and Zahran 2014). In the current study, the magnitude of the gradient of performance was equal to 22.52, and the validation checks were equal to 6. This indicates that the training stopped because of the number of validation checks. Moreover, network training can be stopped by other criteria, such as maximum training time, minimum performance value and/or maximum number of training epochs (iterations).
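The two stopping criteria described above can be expressed as a short loop. The Python sketch below is a simplified illustration of the logic, not the toolbox implementation; the function name and the validation-error sequence in the usage example are hypothetical, apart from the starting value of 92.43 and the gradient of 22.52 quoted in the text.

```python
def stop_reason(val_errors, gradients, max_fail=6, min_grad=1e-5):
    """Return (epoch, reason) at which training would stop.

    val_errors: validation error per epoch; gradients: gradient magnitude per epoch.
    Training stops when the gradient falls below min_grad or when the validation
    error has failed to improve for max_fail successive epochs.
    """
    best = float("inf")
    fails = 0
    for epoch, (val_error, grad) in enumerate(zip(val_errors, gradients)):
        if grad < min_grad:
            return epoch, "minimum gradient reached"
        if val_error < best:
            best = val_error
            fails = 0  # validation improved: reset the failure counter
        else:
            fails += 1  # one more successive validation check failed
        if fails >= max_fail:
            return epoch, "validation checks"
    return len(val_errors) - 1, "maximum epochs"


# Illustrative sequence: best validation error at epoch 0, rising afterwards.
example = stop_reason([92.43, 93.0, 94.0, 95.0, 96.0, 97.0, 98.0], [22.52] * 7)
print(example)
```

When the validation error is lowest at epoch 0 and rises on each of the following six epochs, the sketch stops at epoch 6 because of the validation checks, while the gradient criterion is never triggered.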

Fig. 5 The network state plot trained for 6 epochs (iterations). The time consumed to complete the training progress was 0:00:08 using the “nntool” Neural Network Toolbox in MATLAB R2014a; PC memory: 2.00 GB RAM

The plot in Fig. 5 shows the value of the performance function, i.e., MSE, versus the iteration number. The best validation performance was 92.43 at epoch 0. After epoch 0, the MSE of training continued to descend gradually until epoch 6. This trend does not indicate any major problems with the training. Moreover, after epoch 0, the error on the validation set began to rise, indicating that the network began to overfit the data. In general, the error for the validation data set tends to decrease as training proceeds, up to the point at which overfitting begins; beyond this point, the validation error increases. In overfitting, all training points are well fitted, but the fitting curve oscillates wildly between these points. The test curve increased along with the validation curve, which reveals that the validation and test curves behave similarly.

During training, the adjustable network parameters, i.e., weights and biases, were tuned until the network output matches the target. For example, if the input is very large, then the weight must be very small in order to prevent the transfer function from becoming saturated. This procedure is used for increasing and optimizing the network performance. Those parameters were estimated as:

Weight from inputs to the hidden layer.

$$\left[ {\begin{array}{*{20}c} {1.42} & {0.47} & {2.97} \\ {3.28} & {1.26} & {0.42} \\ { - 1.38} & {1.81} & {0.913} \\ {1.39} & {2.03} & { - 0.85} \\ {2.19} & { - 1.60} & { - 1.13} \\ { - 2.27} & {1.87} & {0.24} \\ \end{array} } \right]$$

Weight from the hidden layer to the output.

$$\left[ {\begin{array}{*{20}c} { - 0.37} & {0.35} & {0.23} & {0.19} & {0.27} & {0.23} \\ \end{array} } \right]$$

Bias to the hidden layer.

$$\left[ {\begin{array}{*{20}c} { - 1.35} \\ { - 0.28} \\ {1.48} \\ {1.22} \\ {0.82} \\ { - 2.44} \\ \end{array} } \right]$$

Bias to the output.

$$\left[ { - 0.25} \right]$$
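With these parameters, the response of the 3–6–1 network can be evaluated as a hyperbolic tangent hidden layer followed by a linear output neuron. The Python sketch below assumes this combination of transfer functions and also assumes that the inputs (BD, pH, X) and the output are expressed on the network's internal scaled range (for example −1 to 1); the text does not state the scaling, so the result is in scaled units rather than in percent removal.

```python
import math

# Parameters as reported above.
W1 = [
    [1.42, 0.47, 2.97],
    [3.28, 1.26, 0.42],
    [-1.38, 1.81, 0.913],
    [1.39, 2.03, -0.85],
    [2.19, -1.60, -1.13],
    [-2.27, 1.87, 0.24],
]
B1 = [-1.35, -0.28, 1.48, 1.22, 0.82, -2.44]
W2 = [-0.37, 0.35, 0.23, 0.19, 0.27, 0.23]
B2 = -0.25


def predict(p):
    """Network output for a 3-element input p = (BD, pH, X), ASSUMED pre-scaled."""
    if len(p) != 3:
        raise ValueError("expected three inputs: BD, pH, X")
    # Hidden layer: net input u = W1 p + b1, passed through tanh (assumed).
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, p)) + b)
        for row, b in zip(W1, B1)
    ]
    # Output layer: linear combination of hidden outputs plus output bias.
    return sum(w * h for w, h in zip(W2, hidden)) + B2


print(predict([1.0, -1.0, 1.0]))
```

Because each hidden output lies between −1 and 1, the scaled prediction is bounded in magnitude by the sum of the absolute output weights plus the absolute output bias.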

The regression plot in Fig. 6 shows the correlation between network outputs and network targets. The dashed line represents the perfect result, where outputs equal targets, while the solid line represents the best-fit linear regression. The training and test plots indicate a good fit with R values greater than 0.9. The validation plot shows an R value of 0.82, indicating that certain data points have poor fits. In order to increase the R value, it is suggested to repeat the training, increase the number of hidden neurons and/or use a different training function. After the network is trained, validated and tested, the network structure can be used to calculate the network response to new input. Similar results were observed by Witek-Krowiak et al. (2014), who presented a review on the application of ANN in biosorption modeling and optimization.
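The R value quoted for each subset is the linear correlation coefficient between targets and outputs. A minimal Python sketch of this calculation, assuming the standard Pearson definition, is given below; the function name is illustrative.

```python
import math


def pearson_r(targets, outputs):
    """Pearson correlation coefficient between network targets and outputs."""
    n = len(targets)
    if n != len(outputs) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)
```

An R value of 1 means the outputs fall exactly on a straight line with positive slope when plotted against the targets, which is why R is read alongside the dashed perfect-fit line rather than on its own.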

Fig. 6 The regression plot of the data points used in the training, validation and test sets

Adaptive neuro-fuzzy inference system

Biosorption of Cd2+ is a nonlinear regression problem, in which several attributes are used to predict an output. The three input attributes are BD, pH and metal ion concentration (X). The output variable to be predicted is the Cd2+ biosorption. The data set is partitioned into a training set (70 %) and a checking set (30 %). The training process stops when the designated epoch number is reached or the error goal is achieved, whichever comes first. The checking set is used for testing the generalization capability of the fuzzy inference system, i.e., to see how well the model predicts the corresponding output values. The function exhsrch in MATLAB performs an exhaustive search within the available inputs to determine the single most influential input attribute in predicting the Cd2+ biosorption. Generally, exhsrch builds an ANFIS model for each input combination, trains it for one epoch and then reports the performance achieved.

As displayed in Fig. 7, the left-most input variable has the least training error, i.e., the most relevance with respect to the output. The plot and the results from the function indicate that the input attribute X is the most influential, with training and checking errors of 10.87 and 17.94, respectively. It is followed by pH, with training and checking errors of 15.80 and 17.39, and then by BD, with training and checking errors of 16.09 and 16.29. This observation is similar to that obtained from the \(2^{3}\) full factorial design. These results indicate that, in order to improve the biosorption performance, the initial Cd2+ ion concentration should be the principal factor to control and optimize.
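The ranking step of the exhaustive search can be reproduced from the reported errors. The Python sketch below assumes that candidates are ordered by ascending training error, which matches the ordering described in the text; it only sorts the reported numbers and does not train any ANFIS model.

```python
# (training error, checking error) for each single-input model, as reported above.
reported_errors = {
    "X": (10.87, 17.94),
    "pH": (15.80, 17.39),
    "BD": (16.09, 16.29),
}


def rank_by_training_error(errors):
    """Order input names from lowest to highest training error (assumed criterion)."""
    return sorted(errors, key=lambda name: errors[name][0])


ranking = rank_by_training_error(reported_errors)
print(ranking)
```

The resulting order places X first, followed by pH and BD.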

Fig. 7 Every input variable’s influence on Cd2+ biosorption

Conclusion

This study demonstrated the effect of three factors, namely biosorbent dose (BD), pH and initial metal concentration (X), on Cd2+ biosorption using rice straw. The \(2^{3}\) factorial design and the adaptive neuro-fuzzy inference system indicated that the influences are in the order X > pH > BD. A proposed neural network (3–6–1) was capable of predicting Cd2+ biosorption with high accuracy (overall R > 0.9). The obtained results could be beneficially applied for designing a full-scale biosorption unit treating industrial effluents containing Cd2+ ions. However, the current study did not investigate Cd2+ removal in continuous-mode experiments; this will be the focus of our future studies.