Bioprocess and Biosystems Engineering, Volume 33, Issue 9, pp 1051–1058

Artificial neural network modelling of a large-scale wastewater treatment plant operation


  • Dünyamin Güçlü
    • Department of Environmental Engineering, Engineering and Architectural Faculty, Selcuk University
  • Şükrü Dursun
    • Department of Environmental Engineering, Engineering and Architectural Faculty, Selcuk University
Original Paper

DOI: 10.1007/s00449-010-0430-x

Cite this article as:
Güçlü, D. & Dursun, Ş. Bioprocess Biosyst Eng (2010) 33: 1051. doi:10.1007/s00449-010-0430-x


Abstract

Artificial neural networks (ANNs), an artificial intelligence method, provide effective predictive models for complex processes. Three independent ANN models trained with the back-propagation algorithm were developed to predict effluent chemical oxygen demand (COD), effluent suspended solids (SS) and aeration tank mixed liquor suspended solids (MLSS) concentrations of the Ankara central wastewater treatment plant. The appropriate architecture of the ANN models was determined through several steps of training and testing. The ANN models yielded satisfactory predictions. The root mean square error, mean absolute error and mean absolute percentage error were 3.23 mg/L, 2.41 mg/L and 5.03% for COD; 1.59 mg/L, 1.21 mg/L and 17.10% for SS; and 52.51 mg/L, 44.91 mg/L and 3.77% for MLSS, respectively, indicating that the developed models could be used efficiently. Overall, the results also confirm that the ANN modelling approach has considerable potential for simulation, precise performance prediction and process control of wastewater treatment plants.


Keywords

Activated sludge process · Modelling · Artificial neural network · Wastewater treatment plant


Introduction

Increasingly stringent effluent standards require effective wastewater treatment before discharge into receiving waters. Wastewater treatment has inherently complex and dynamic characteristics, and the relationships between the important process variables are strongly nonlinear. Operational difficulties are therefore often encountered in wastewater treatment plants (WWTPs). Modelling the treatment process dynamics supports proper operation and control of WWTPs. The International Association on Water Pollution Research and Control (IAWPRC) task group developed a mathematical model of the activated sludge process, Activated Sludge Model No. 1 (ASM1), which is considered the "state-of-the-art" model for carbonaceous substrate removal, nitrification and denitrification processes [1]. However, such models still have weaknesses that complicate model calibration and practical application. These weaknesses include: (1) the lack of an entirely satisfactory description of the cause–effect relationships within activated sludge processes, (2) a large number of kinetic and stoichiometric parameters, many of which are difficult to determine, and (3) a large number of parameters whose values must be respecified for different process conditions [2–5].

Alternatively, artificial neural network modelling approaches have received considerable attention in modelling wastewater treatment processes in recent years. A neural network model has the distinctive ability to learn non-linear functional relationships, and it does not require any prior structural knowledge of the relationships between the important variables and the processes to be modelled [6]. Much recent research on wastewater treatment has been based on ANNs. These studies have mainly focused on estimation of wastewater process parameters [7], simulation of wastewater treatment performance [8–10], monitoring [6], control [11], classification [4, 12] and software sensor design [13]. To date, only a limited number of studies in the literature have applied an ANN model to the operation of large-scale aerobic wastewater treatment plants [14].

The purpose of the present study was to develop a dynamic model that successfully simulates effluent COD, effluent SS and aeration tank MLSS concentrations and monitors the performance of a large-scale wastewater treatment plant. For this reason, three independent ANN models were applied to the Ankara Central Wastewater Treatment Plant. In contrast to the studies in the literature, the ANN models in this study were used for dynamic modelling to predict effluent COD, effluent SS and aeration tank MLSS in a large-scale WWTP. The prediction performances of the models under dynamic conditions were evaluated and compared using statistical parameters.

Materials and methods

Treatment process

The Ankara Central Wastewater Treatment Plant (ACWWTP) is located about 40 km from the city centre. The plant is designed to treat all domestic and pre-treated industrial wastewaters of Ankara city (Turkey) and is based on a biological process, with denitrification and phosphorus removal processes planned for the future. It is designed to treat the wastewater of about 3,900,000 population equivalents (p.e.), with a dry weather flow of 765,000 m3/day and a storm weather flow of 1,530,000 m3/day. Its capacity will be extended to treat a dry weather flow of 1,380,000 m3/day and a peak storm weather flow of 2,750,000 m3/day, which will be sufficient until 2025. Influent with a biochemical oxygen demand (BOD5) concentration of 300 mg/L leaves the plant with at most 30 mg/L BOD5 after treatment and is discharged directly into Ankara Creek. The ACWWTP has a non-nitrifying conventional activated sludge system and includes screening and grit removal, primary clarification, surface aeration tanks, final clarification and sludge treatment facilities. The process scheme of the wastewater treatment plant is presented in Fig. 1.
Fig. 1

Process flow chart of Ankara Central Waste Water Treatment Plant
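The design figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study: the function names are hypothetical, and the only inputs are the design BOD5 concentrations (300 and 30 mg/L) and the dry weather flow (765,000 m3/day) stated in the text. The same flow-times-concentration relation is presumably how a load in kg/day, such as the influent COD loading listed in Table 1, would be derived, but that is an assumption.

```python
def removal_efficiency(c_in, c_out):
    """Fraction of a pollutant removed between influent and effluent."""
    return (c_in - c_out) / c_in


def mass_load_kg_per_day(flow_m3_per_day, conc_mg_per_l):
    """Mass load in kg/day; 1 mg/L equals 1 g/m3, so divide by 1000 to get kg."""
    return flow_m3_per_day * conc_mg_per_l / 1000.0


# Design values taken from the plant description above.
bod_removal = removal_efficiency(300.0, 30.0)
dry_weather_load = mass_load_kg_per_day(765_000, 300.0)

print(f"Design BOD5 removal: {bod_removal:.0%}")                        # 90%
print(f"Dry weather BOD5 load: {dry_weather_load:,.0f} kg/day")         # 229,500 kg/day
```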


Input and output parameters were selected or generated from the parameters commonly used in the literature to describe aerobic wastewater treatment systems. On-line and off-line operational data covering 290 h, gathered at 2-h intervals during a 12-day intensive sampling campaign at the ACWWTP, were used to develop separate COD, SS and MLSS models. All samples were analysed for chemical oxygen demand (COD), suspended solids (SS), ammonium nitrogen (NH4-N) and total Kjeldahl nitrogen (TKN) in accordance with the standard methods [15]. TKN concentrations were calculated from the correlation between the previously measured NH4-N and TKN concentrations. Flow rate (Q), return activated sludge (RAS) flow rate, waste activated sludge (WAS) flow rate and dissolved oxygen (DO) concentration were measured on-line with Endress–Hauser analyzers in the wastewater treatment plant. The statistical characteristics of the wastewater parameters measured during the campaign period are given in Table 1.
Table 1

The statistical values of measured wastewater parameters during the campaign period [16]

Influent flow rate (m3/day)*
Influent COD concentration (mg/L)
Influent SS concentration (mg/L)
Influent COD loading (kg/day)
Influent TKN concentration (mg/L)
RAS flow rate (m3/day)*
WAS flow rate (m3/day)*
DO concentration in aeration tank (mg/L)
Effluent COD concentration (mg/L)
Effluent SS concentration (mg/L)

* These values refer to the modelled lane

The MLP artificial neural network

Artificial neural networks (ANNs) are biologically inspired intelligent techniques. ANNs are generally made up of a number of simple and highly interconnected neurons organized into layers. ANNs have many structures and architectures [17, 18]. Multilayered perceptron artificial neural networks (MLP-ANNs) are the simplest and therefore the most commonly used neural network architectures [18]. The back-propagation with momentum (BPM) learning algorithm [19] was used in this study because it is the most commonly adopted MLP-ANN training algorithm. The MLP-ANN structure consists of one input layer, one intermediate or hidden layer and one output layer. Each layer can have a number of neurons, which are connected by weights to the neurons in the neighbouring layers.

Neurons in the input layer only act as buffers for distributing the input signals \( x_{i} \) to neurons in the hidden layer. Each neuron j in the hidden layer sums up its input signals \( x_{i} \) after weighting them with the strengths of the respective connections \( w_{ji} \) from the input layer and computes its output \( y_{j} \) as a function f of the sum:
$$ y_{j} = f\left( \sum\limits_{i} w_{ji} x_{i} \right) \tag{1} $$
where f can be a sigmoid or a hyperbolic tangent function. The output of neurons in the output layer is computed similarly.
Training a network consists of adjusting the weights of the network using a learning algorithm. The BPM algorithm is a gradient descent algorithm and gives the change \( \Delta w_{ji}(k) \) in the weight of a connection between neurons i and j as follows:
$$ \Delta w_{ji} \left( k \right) = \alpha \delta_{j} x_{i} + \mu \Delta w_{ji} \left( k - 1 \right) \tag{2} $$
where \( x_{i} \) is the input, α is the learning coefficient, μ is the momentum coefficient, and \( \delta_{j} \) is a factor depending on whether neuron j is an output neuron or a hidden neuron. For output neurons,
$$ \delta_{j} = \frac{\partial f}{\partial {\text{net}}_{j}}\,(y_{j}^{T} - y_{j}) \tag{3} $$
where \( {\text{net}}_{j} = \sum_{i} x_{i} w_{ji} \) and \( y_{j}^{T} \) is the target output for neuron j. For hidden neurons,
$$ \delta_{j} = \frac{\partial f}{\partial {\text{net}}_{j}}\sum\limits_{q} w_{qj} \delta_{q} . \tag{4} $$
As there are no target outputs for hidden neurons in Eq. 4, the difference between the target and actual output of a hidden neuron j is replaced by the weighted sum of the \( \delta_{q} \) terms already obtained for the neurons q connected to the output of j. Thus, iteratively, beginning with the output layer, the δ term is computed for all neurons in all layers except the input layer, and the weights are then updated according to Eq. 2.
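The update rules in Eqs. 1–4 can be illustrated with a short NumPy sketch. This is not the authors' MATLAB implementation: the function name `bpm_step`, the weight initialisation, the learning and momentum coefficients, the absence of bias terms, and the single synthetic training sample are all assumptions made for illustration. The activation functions (hyperbolic tangent in the hidden layer, sigmoid in the output layer) and the 8:20:1 layout follow the configuration reported later for the COD model.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bpm_step(x, y_target, W_h, W_o, dW_h_prev, dW_o_prev, alpha, mu):
    """One back-propagation-with-momentum update for a single sample."""
    # Forward pass (Eq. 1): weighted sum followed by the activation f.
    y_h = np.tanh(W_h @ x)          # hidden layer, f = tanh
    y_o = sigmoid(W_o @ y_h)        # output layer, f = sigmoid

    # Output-layer delta (Eq. 3): f'(net_j) * (target - output).
    delta_o = y_o * (1.0 - y_o) * (y_target - y_o)
    # Hidden-layer delta (Eq. 4): f'(net_j) * sum_q w_qj * delta_q.
    delta_h = (1.0 - y_h ** 2) * (W_o.T @ delta_o)

    # Weight change (Eq. 2): alpha * delta_j * x_i + mu * previous change.
    dW_o = alpha * np.outer(delta_o, y_h) + mu * dW_o_prev
    dW_h = alpha * np.outer(delta_h, x) + mu * dW_h_prev
    return W_h + dW_h, W_o + dW_o, dW_h, dW_o, y_o


# Illustrative run on one synthetic sample (values are not plant data).
rng = np.random.default_rng(0)
x = rng.random(8)                       # 8 normalized inputs
y_t = np.array([0.8])                   # normalized target output
W_h = rng.normal(0.0, 0.1, (20, 8))     # 20 hidden neurons
W_o = rng.normal(0.0, 0.1, (1, 20))     # 1 output neuron
dW_h, dW_o = np.zeros_like(W_h), np.zeros_like(W_o)

errors = []
for _ in range(200):
    W_h, W_o, dW_h, dW_o, y = bpm_step(x, y_t, W_h, W_o, dW_h, dW_o,
                                       alpha=0.05, mu=0.5)
    errors.append(abs(y_t[0] - y[0]))

print(f"first error: {errors[0]:.4f}, last error: {errors[-1]:.4f}")
```

With μ = 0 the update reduces to plain gradient descent on the squared output error; the momentum term reuses the previous weight change, which is what distinguishes BPM from standard back-propagation.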

Model implementation

A number of steps were carried out during the model development process. These are shown schematically in Fig. 2.
Fig. 2

A stepwise model development process

Training and testing

Model training and testing is one of the most important components of model development for the ACWWTP. The quality of the training data is important for prediction competence, so measurement campaigns were undertaken in order to train and test the model properly. All the programs used in this study were implemented in MATLAB®. Generally, the application of an ANN involves three modelling steps: training, validation and testing. The training set is used for adjusting the connection weights, whereas the validation set is used for determining the network geometry and model parameters. Finally, the testing set is used for testing the optimality and generalisation ability of the developed model. It should be noted that the validation set was not used during the model development process. The correlation coefficient and RMS error values were used as the performance criteria and monitored during training. In the present study, training was stopped after each iteration (epoch) for every candidate model tested at each step, and each time training was stopped, the model was tested. A frequently encountered problem in the application of ANNs is the risk of over-fitting [20]. In order to avoid over-fitting in the neural network model, the data set of 290 hourly observations was randomly subdivided into two sets: the first, consisting of 240 h of observations, to train the neural network model, and the second, consisting of the remaining 50 h of observations, to test the model. All variables were normalized to the range 0–1 before being fed to the neural network, and the outputs were de-normalized afterwards. Normalization is carried out by determining the maximum and minimum values of each variable over the whole data period and calculating the normalized variables using Eq. 5.
$$ x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} . \tag{5} $$
The inputs fed to the ANN models include influent Q, RAS, WAS, influent concentrations of COD, SS, DO, TKN and COD load. The ANN output was the residuals of effluent concentrations of COD, SS and aeration tank concentrations of MLSS, respectively.
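The normalization of Eq. 5 and the random 240/50 split described above can be sketched as follows. The data here are synthetic stand-ins, and the function names, the random seed and the value range are hypothetical; only the min-max formula, the eight input columns and the 290/240/50 sample counts come from the text.

```python
import numpy as np


def minmax_fit(data):
    """Column-wise minimum and maximum over the whole data period."""
    return data.min(axis=0), data.max(axis=0)


def normalize(data, x_min, x_max):
    """Eq. 5: scale each variable to the range 0-1.

    Assumes x_max > x_min for every column (a constant column would
    cause a division by zero).
    """
    return (data - x_min) / (x_max - x_min)


def denormalize(data_norm, x_min, x_max):
    """Inverse of Eq. 5: map normalized values back to original units."""
    return data_norm * (x_max - x_min) + x_min


rng = np.random.default_rng(42)
# Synthetic stand-in for 290 observations of the 8 input variables.
data = rng.uniform(10.0, 500.0, size=(290, 8))

x_min, x_max = minmax_fit(data)
data_norm = normalize(data, x_min, x_max)

# Random subdivision into 240 training and 50 testing observations.
order = rng.permutation(len(data))
train_idx, test_idx = order[:240], order[240:]
train, test = data_norm[train_idx], data_norm[test_idx]

print(train.shape, test.shape)
```

Because the minimum and maximum are taken over the whole data period, as the text states, the test observations also fall inside the 0–1 range.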

Evaluation of model performance

The performance of each ANN model was evaluated by calculating the RMSE, MAE and MAPE between the modelled output and the measurements of both the training and testing data sets, as given in Eqs. 6–8. In addition, the test results satisfying the minimum errors were assessed using the correlation coefficient (r) given in Eq. 9. These r values were used as another criterion for determining the optimum model structure.
$$ {\text{RMSE}} = \sqrt{\sum\limits_{i = 1}^{N} \frac{(x_{i} - y_{i})^{2}}{N}} \tag{6} $$
$$ {\text{MAE}} = \frac{1}{N}\left( \sum\limits_{i = 1}^{N} \left| x_{i} - y_{i} \right| \right) \tag{7} $$
$$ {\text{MAPE}}\left( \% \right) = \frac{1}{N}\left( \sum\limits_{i = 1}^{N} \left| \frac{x_{i} - y_{i}}{x_{i}} \right| \right) \times 100 \tag{8} $$
$$ r = \frac{\sum \left( x - \bar{x} \right)\left( y - \bar{y} \right)}{\sqrt{\sum \left( x - \bar{x} \right)^{2} \sum \left( y - \bar{y} \right)^{2}}} \tag{9} $$
where \( x_{i} \) and \( y_{i} \) are the measured and predicted values, \( \bar{x} \) and \( \bar{y} \) are the averages of the measured and predicted values, respectively, and N is the total number of model outputs.
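The four criteria in Eqs. 6–9 translate directly into NumPy. The sketch below uses hypothetical function names and four made-up measured/predicted pairs chosen so the arithmetic is easy to follow by hand; they are not plant data.

```python
import numpy as np


def rmse(x, y):
    """Root mean square error (Eq. 6)."""
    return float(np.sqrt(np.mean((x - y) ** 2)))


def mae(x, y):
    """Mean absolute error (Eq. 7)."""
    return float(np.mean(np.abs(x - y)))


def mape(x, y):
    """Mean absolute percentage error in % (Eq. 8); x must be non-zero."""
    return float(np.mean(np.abs((x - y) / x)) * 100.0)


def pearson_r(x, y):
    """Correlation coefficient (Eq. 9)."""
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))


# Illustrative values only: x is "measured", y is "predicted".
measured = np.array([40.0, 50.0, 60.0, 50.0])
predicted = np.array([42.0, 47.0, 63.0, 48.0])

print(f"RMSE = {rmse(measured, predicted):.2f}")        # sqrt(6.5), about 2.55
print(f"MAE  = {mae(measured, predicted):.2f}")         # 2.50
print(f"MAPE = {mape(measured, predicted):.2f} %")      # 5.00 %
print(f"r    = {pearson_r(measured, predicted):.3f}")
```

Note that MAPE divides by the measured value, so the same absolute error weighs more heavily when concentrations are low. This is consistent with the SS model showing the smallest RMSE but the largest MAPE in the reported results.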

Results and discussion

Three independent ANN models were developed to predict effluent COD, effluent SS and aeration tank MLSS concentrations of the ACWWTP. Various ANN models were tested by changing the number of neurons in the hidden layer between 3 and 100 to determine the optimal structure of the neural network. The structure of the network (number of layers, number of hidden nodes, learning rate, momentum values) was optimized during the training phase using the gradient descent algorithm to ensure good learning and prediction. Beyond the optimum, increasing the number of hidden neurons did not improve the model any further. The model that resulted in the minimum errors was chosen as the best model. A model is judged to be very good when the RMSE is at a minimum and r is high [21]. The optimum ANN model structure obtained for the prediction of effluent COD, effluent SS and aeration tank MLSS concentrations is given in Fig. 3. One of the most important factors in configuring a neural network model is determining the number of hidden layers and the number of neurons in the hidden layer. Earlier investigations [22] showed that a network with one hidden layer is a universal approximator. The hidden layer uses the hyperbolic tangent as its activation function and the output layer uses the sigmoid function. The best-suited ANN structures were found to be 8:20:1 for COD, 8:23:1 for SS and 8:14:1 for MLSS, where the numbers denote the nodes in the input, hidden and output layers, respectively. The results for the training and testing data sets were then compared to evaluate model performance.
Fig. 3

The ANN model structure for prediction of COD, SS and MLSS concentration
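The search over hidden-layer sizes described above can be sketched as a simple loop that trains one network per candidate size and keeps the one with the lowest test RMSE. Everything specific in this sketch is an assumption: the function names, the synthetic data, batch-mode weight updates, the training hyperparameters, the absence of bias terms, and the short candidate list (the study scanned 3 to 100 neurons). Selecting on the testing set mirrors the statement that no separate validation set was used; with more data, a validation set would normally be preferred for this choice.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_mlp(X, y, n_hidden, epochs=200, alpha=0.05, mu=0.5, seed=0):
    """Train an n_in:n_hidden:1 MLP (tanh hidden, sigmoid output) with momentum."""
    rng = np.random.default_rng(seed)
    W_h = rng.normal(0.0, 0.1, (n_hidden, X.shape[1]))
    W_o = rng.normal(0.0, 0.1, (1, n_hidden))
    dW_h, dW_o = np.zeros_like(W_h), np.zeros_like(W_o)
    n = len(X)
    for _ in range(epochs):
        H = np.tanh(X @ W_h.T)                          # (n, n_hidden)
        out = sigmoid(H @ W_o.T)                        # (n, 1)
        delta_o = out * (1.0 - out) * (y[:, None] - out)
        delta_h = (1.0 - H ** 2) * (delta_o @ W_o)      # (n, n_hidden)
        # Batch-averaged weight changes with momentum (assumed batch mode).
        dW_o = alpha * (delta_o.T @ H) / n + mu * dW_o
        dW_h = alpha * (delta_h.T @ X) / n + mu * dW_h
        W_o, W_h = W_o + dW_o, W_h + dW_h
    return W_h, W_o


def predict(X, W_h, W_o):
    return sigmoid(np.tanh(X @ W_h.T) @ W_o.T)[:, 0]


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


def scan_hidden_sizes(X_tr, y_tr, X_te, y_te, sizes):
    """Return the hidden size with the lowest test RMSE and all results."""
    results = {}
    for h in sizes:
        W_h, W_o = train_mlp(X_tr, y_tr, h)
        results[h] = rmse(y_te, predict(X_te, W_h, W_o))
    return min(results, key=results.get), results


# Synthetic, already-normalized data: 290 samples, 8 inputs, 1 output.
rng = np.random.default_rng(1)
X = rng.random((290, 8))
y = 0.5 + 0.3 * np.sin(3.0 * X[:, 0]) * X[:, 1]
X_tr, y_tr, X_te, y_te = X[:240], y[:240], X[240:], y[240:]

best, results = scan_hidden_sizes(X_tr, y_tr, X_te, y_te, sizes=(3, 10, 20))
print("test RMSE per hidden size:", results)
print("selected hidden size:", best)
```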

Measured and predicted data are given in Figs. 4, 5 and 6 to illustrate the model performance on the training and testing data. The results of the COD model for the training and testing data sets are shown in Fig. 4. The ANN model predicted the dynamic behaviour of the COD concentrations with good accuracy and provided a very good fit to both the training and testing data. The peak values in the plot of the measured concentrations were well reproduced by the predictions. As can be seen in Fig. 5, the differences between the predicted and measured effluent SS values were higher than for the other two models, although the predictions still reflect the fluctuations in the measured values. Here a poorer fit between the simulated and observed total SS is apparent. A possible reason is that the effluent SS concentration is sensitive to changes in the surface loading rate applied to the final settling tank; a similar observation was reported by Cote et al. [23]. The same simulation procedure as that adopted for the effluent COD and SS concentrations was applied to develop the model for the aeration tank MLSS concentration. The simulation performance was as good as that of the COD model. As shown in Fig. 6, the simulated MLSS values reflect the fluctuations and peak concentrations in the measured values. Visual inspection indicates that the ANN models resulted in a good fit for the measured COD, SS and MLSS data.
Fig. 4

Time plots of measured and predicted effluent COD concentration for training and testing set
Fig. 5

Time plots of measured and predicted effluent SS concentration for training and testing set
Fig. 6

Time plots of measured and predicted aeration tank MLSS concentration for training and testing set

Model results can be compared using error analyses based on the deviations between predicted values and original observations. Table 2 summarises the performance statistics of the trained and tested models for the same data set. As shown in Table 2, for model training, the RMSEs, MAEs and MAPEs between the measured and predicted values were 2.72 mg/L, 2.08 mg/L and 4.48% for the COD model; 0.71 mg/L, 0.49 mg/L and 6.98% for the SS model; and 38.39 mg/L, 30.23 mg/L and 2.58% for the MLSS model, respectively. For model testing, the RMSEs, MAEs and MAPEs were 3.23 mg/L, 2.41 mg/L and 5.03% for the COD model; 1.59 mg/L, 1.21 mg/L and 17.10% for the SS model; and 52.51 mg/L, 44.91 mg/L and 3.77% for the MLSS model, respectively. According to the error analysis, the COD, SS and MLSS models predicted with a high level of accuracy.
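Table 2

ANN model performance statistics calculated over the whole of training and testing data

                 Effluent COD           Effluent SS            Aeration tank MLSS
                 Training   Testing     Training   Testing     Training   Testing
RMSE (mg/L)      2.72       3.23        0.71       1.59        38.39      52.51
MAE (mg/L)       2.08       2.41        0.49       1.21        30.23      44.91
MAPE (%)         4.48       5.03        6.98       17.10       2.58       3.77
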

Correlation coefficient (r) analysis was also performed; the r values for the COD model were 0.87 for the training data and 0.85 for the testing data. The results are shown in Fig. 7. Perfect agreement is achieved when r is equal to 1.0. The correlation coefficients show good correspondence between measurements and predictions. Furthermore, consistent with the other error criteria, the r values of the SS model were 0.94 and 0.68 for the training and testing data sets, respectively. On the other hand, the high r results obtained for the MLSS are likely the result of the COD model. Scatter plots of measured MLSS concentrations against the predictions of the ANN model are also presented in Fig. 7.
Fig. 7

Scatter plots of measured and predicted effluent COD, SS and aeration tank MLSS concentrations for training (a, c, e) and testing (b, d, f) data sets

Comparable observations were made by Pai et al. [24], who compared different types of model for predicting the effluent from an industrial park WWTP using ANN. They found that the MAPEs lay between 4.49–7.24% and 4.20–14.36% for COD and SS, respectively. In another study, Pai et al. [25] employed ANN to predict COD and SS in the effluent from sequencing batch reactors for a hospital wastewater treatment plant (HWWTP). They reported prediction accuracies of 48.22 and 19.81% for COD and SS. Other studies applying the ANN modelling method to a full-scale wastewater treatment plant estimated correlation coefficient (R-square) values ranging from 0.63 to 0.81 for BOD and from 0.45 to 0.65 for SS in the plant effluent [14]. In the study by Gontarski et al. [26], ANN was used to predict the effluent total organic carbon (TOC) from an industrial wastewater treatment plant (IWWTP). Since TOC was the single output layer variable, a good fit was achieved. In this study, minimum RMSEs, MAEs and MAPEs of 3.23 mg/L, 2.41 mg/L and 5.03% for COD; 1.59 mg/L, 1.21 mg/L and 17.10% for SS; and 52.51 mg/L, 44.91 mg/L and 3.77% for MLSS were achieved, respectively. The highest r values, determined on the testing data set, were 0.85, 0.68 and 0.88 for COD, SS and MLSS. Although it is beyond the scope of this study, the results of the ANN model were compared with those of Güclü et al. [27]. The ANN alone performed slightly worse than the ASM1–ANN hybrid model for COD on the same data set.


Conclusions

Three independent ANN models were developed to predict the effluent COD, effluent SS and aeration tank MLSS concentrations for a large-scale municipal wastewater treatment plant. The prediction performance of each model was evaluated and compared. The ANN trained on the 240-h data set gave reliable predictions for the remaining 50-h data set. The following conclusions can be drawn from the simulation results:
  • The error values of COD model were smaller than those of SS and MLSS models. Maximum r values of 0.85, 0.68 and 0.88 for COD, SS and MLSS were achieved using ANN. The minimum MAPE values for COD, SS and MLSS were 5.03, 17.10 and 3.77%, respectively. Minimum RMSEs of 3.23, 1.59 and 52.51 mg/L and minimum MAEs of 2.41, 1.21 and 44.91 mg/L for COD, SS and MLSS were also achieved.

  • Considering the high level of complexity in biological processes, the broadness of the data range and the other computed error values, the ANN approach provides successful results in predicting the target output parameters. The ANN-based models were found to be a viable alternative for more precise performance prediction and monitoring of treatment process dynamics.

  • ANN has considerable potential as a general modelling tool, especially for a variety of other treatment process systems that are difficult to describe with conventional activated sludge modelling techniques.

  • Models built with more data from on-line process variables should be investigated. This would result in improved prediction capability and increased applicability of the model in prediction, monitoring and advanced control of wastewater treatment systems.


Acknowledgments

This paper has been derived from a part of the PhD thesis of Dünyamin Güçlü. The authors would like to thank the Ankara Water and Sewerage Administration (ASKI) General Directorate (Turkey) for their help during the experimental study and for providing the WWTP process data. This study was supported by the Selcuk University Research Fund (BAP) with Project No: 2005-101018.

Copyright information

© Springer-Verlag 2010