Abstract
Microbial source tracking (MST) concerns the definition of new indicators and appropriate detection methods, the identification of host-specific indicators of fecal pollution, and ultimately the development of useful and reliable predictive models for practical deployment. Optimal predictive models should be designed using proper statistical and computational tools for the analysis of the available data samples. A further requirement is the determination of appropriate sets of predictors (indicators, tracers) for developing accurate and low-cost MST solutions. This chapter briefly reviews some of these modeling tools and assesses their use and feasibility in providing more accurate MST-based results. It also evaluates the potential of established and new algorithmic methods for the identification of fecal pollution sources.
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Belanche, L.A., Blanch, A.R. (2011). Statistical Approaches for Modeling in Microbial Source Tracking. In: Hagedorn, C., Blanch, A., Harwood, V. (eds) Microbial Source Tracking: Methods, Applications, and Case Studies. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9386-1_9
DOI: https://doi.org/10.1007/978-1-4419-9386-1_9
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-9385-4
Online ISBN: 978-1-4419-9386-1
eBook Packages: Biomedical and Life Sciences (R0)