Statistical Approaches for Modeling in Microbial Source Tracking

Microbial Source Tracking: Methods, Applications, and Case Studies

Abstract

Microbial source tracking (MST) involves defining new indicators and appropriate detection methods, identifying host-specific indicators of fecal pollution, and ultimately developing useful and reliable predictive models for practical deployment. Optimal predictive models should be designed using statistical and computational tools appropriate for analyzing the available data samples. A further requirement is determining appropriate sets of predictors (indicators, tracers) for developing accurate and low-cost MST solutions. This chapter briefly reviews some of these modeling tools, along with their use and feasibility in providing more accurate MST-based results. It also evaluates the potential of established and new algorithmic methods for identifying fecal pollution sources.



Author information

Corresponding author

Correspondence to Anicet R. Blanch.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Belanche, L.A., Blanch, A.R. (2011). Statistical Approaches for Modeling in Microbial Source Tracking. In: Hagedorn, C., Blanch, A., Harwood, V. (eds) Microbial Source Tracking: Methods, Applications, and Case Studies. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9386-1_9