Inference model derivation with a pattern analysis for predicting the risk of microbial pollution in a sewer system

  • Yoon-Seok Timothy Hong
  • Byeong-Cheon Paik
Original Paper


A mathematical model for predicting fecal coliform bacteria concentration is important because it provides a basis for water quality management decisions that minimize the risk of microbial pollution to the public. This paper introduces a hybrid modeling methodology that combines a neural network-based pattern analysis with an evolutionary process model induction system. The neural network-based pattern analysis is applied to extract knowledge of the inter-relationships between fecal coliform concentrations and other measurable variables in a sewer system. Based on the results of this analysis, the evolutionary process model induction system is used to derive mathematical inference models that predict fecal coliform bacteria concentration from easily measurable variables, so that the concentration need not be measured directly in the sewer system. The pattern analysis indicates that temperature and ammonia concentration are the most important driving forces behind increases in fecal coliform bacteria concentration in the sewer system at Paraparaumu City, New Zealand. Fecal coliform bacteria concentration is also positively correlated with dissolved phosphorus and inversely correlated with flow rate. Multivariate inference models that predict fecal coliform bacteria concentration are successfully derived as functions of flow rate, temperature, ammonia, and dissolved phosphorus, in the form of understandable mathematical formulae, even though a priori mathematical knowledge of the dynamic nature of fecal coliform bacteria is poor. The models evolved by the evolutionary process model induction system perform slightly better than the multi-layer perceptron neural network model.
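The abstract gives neither the derived formulae nor the grammar used by the induction system. As a rough illustration only, the sketch below shows how an evolutionary search of this kind might score candidate formulae over the four input variables. The expression encoding, the variable names (`flow`, `temp`, `nh4`, `dp`), the synthetic data, and the toy target relationship are all assumptions made for the example, not the authors' model or data.

```python
import math
import random

# Binary operators available to candidate formulae (an assumed, minimal set).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    # Protected division, a common genetic-programming convention.
    "/": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,
}

def evaluate(expr, sample):
    """Evaluate a nested-tuple expression tree on one sample (a dict)."""
    if isinstance(expr, (int, float)):
        return float(expr)          # numeric constant
    if isinstance(expr, str):
        return sample[expr]         # input variable
    op, left, right = expr
    return OPS[op](evaluate(left, sample), evaluate(right, sample))

def rmse(expr, samples, targets):
    """Root-mean-square error of a candidate formula (lower is fitter)."""
    squared = [(evaluate(expr, s) - t) ** 2 for s, t in zip(samples, targets)]
    return math.sqrt(sum(squared) / len(squared))

# Synthetic samples with hypothetical variable names and arbitrary ranges.
rng = random.Random(0)
samples = [
    {
        "flow": rng.uniform(1.0, 10.0),
        "temp": rng.uniform(10.0, 25.0),
        "nh4": rng.uniform(5.0, 40.0),
        "dp": rng.uniform(1.0, 8.0),
    }
    for _ in range(50)
]

# Toy target: rises with temperature and ammonia, falls with flow rate.
# This is NOT the formula derived in the paper.
true_expr = ("/", ("*", "temp", "nh4"), "flow")
targets = [evaluate(true_expr, s) for s in samples]

# Score a few hand-written candidates, as a fitness function would.
candidates = {
    "temp*nh4/flow": true_expr,
    "temp+nh4": ("+", "temp", "nh4"),
    "dp/flow": ("/", "dp", "flow"),
}
scores = {name: rmse(expr, samples, targets) for name, expr in candidates.items()}
best = min(scores, key=scores.get)
```

In a full system, the candidates would be generated and recombined under a grammar rather than written by hand; this sketch covers only the scoring step.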


  • Fecal coliform bacteria
  • Water quality modeling
  • Multivariate inference model derivation
  • Neural network-based pattern analysis
  • Self-Organising Feature Maps
  • Evolutionary process model induction system
  • Grammar-based genetic programming



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. Department of Urban Engineering, London South Bank University, London, UK
  2. Department of Civil and Environmental Engineering, Chonnam National University, Yosu-si, Republic of Korea
