Temporal Synchronization Analysis for Improving Regression Modeling of Fecal Indicator Bacteria Levels

Water, Air, & Soil Pollution

Abstract

Multiple linear regression models are often used to predict levels of fecal indicator bacteria (FIB) in recreational swimming waters based on independent variables (IVs) such as meteorologic, hydrodynamic, and water-quality measures. The IVs used for these analyses are traditionally measured at the same time as the water-quality sample. We investigated the improvement in empirical modeling performance by using IVs that had been temporally synchronized with the FIB response variable. We first examined the univariate relationship between multiple “aspects” of each IV and the response variable to find the single aspect of each IV most strongly related to the response. Aspects are defined by the temporal window and lag (relative to when the response is measured) over which the IV is averaged. Models were then formed using the “best” aspects of each IV. Employing iterative cross-validation, we examined the average improvement in the mean squared error of prediction, MSEP, for a testing dataset after using our temporal synchronization technique on the training data. We compared the MSEP values of three methodologies: predictions made using unsynchronized IVs (UNS), predictions made using synchronized IVs where aspects were chosen using a Pearson correlation coefficient (PCC), and predictions using IV aspects chosen using the PRESS statistic (PRS). Averaging over 500 randomly generated testing datasets, the MSEP values using the PRS technique were 50 % lower (p < 0.001) than the MSEP values of the UNS technique. The average MSEP values of the PCC technique were 26 % lower (p < 0.001) than the MSEP values of the UNS technique. We conclude that temporal synchronization is capable of significantly improving predictive models of FIB levels in recreational swimming waters.
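The aspect-selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`aspect`, `press`, `best_aspect`), the single-IV simple linear regression used for the PRESS statistic, and the candidate windows and lags are all assumptions made for the example, and the data are synthetic. Each aspect is the mean of an IV over a window of a given length that ends a given lag before the response sample; the "best" aspect is the one with the largest absolute Pearson correlation with the response, or the smallest PRESS.

```python
import numpy as np

def aspect(iv, sample_idx, window, lag):
    """Mean of `iv` over `window` time steps ending `lag` steps before each sample.

    With lag=0 the window includes the sampling time itself.
    """
    out = np.empty(len(sample_idx))
    for k, t in enumerate(sample_idx):
        end = t - lag + 1            # exclusive end of the averaging window
        start = end - window
        if start < 0:
            raise ValueError("window and lag reach before the start of the series")
        out[k] = iv[start:end].mean()
    return out

def press(x, y):
    """PRESS statistic for the simple regression y ~ a + b*x.

    Uses the identity that the leave-one-out residual equals the ordinary
    residual divided by (1 - h_ii), where h_ii is the leverage.
    """
    X = np.column_stack([np.ones_like(x), x])
    H = X @ np.linalg.pinv(X.T @ X) @ X.T
    resid = y - H @ y
    return float(np.sum((resid / (1.0 - np.diag(H))) ** 2))

def best_aspect(iv, y, sample_idx, windows, lags, criterion="pcc"):
    """Return the (window, lag) pair whose aspect is most strongly related to y."""
    best, best_score = None, -np.inf
    for w in windows:
        for lag in lags:
            x = aspect(iv, sample_idx, w, lag)
            if criterion == "pcc":
                score = abs(np.corrcoef(x, y)[0, 1])   # larger is better
            else:
                score = -press(x, y)                   # smaller PRESS is better
            if score > best_score:
                best, best_score = (w, lag), score
    return best

# Synthetic demonstration (not data from the study): the response depends on
# a 6-step average of the IV that ends 3 steps before each sample.
rng = np.random.default_rng(0)
iv = rng.normal(size=600)                 # hypothetical regularly sampled IV
sample_idx = np.arange(24, 600, 24)       # times at which the response is sampled
y = aspect(iv, sample_idx, 6, 3) + 0.01 * rng.normal(size=len(sample_idx))

windows, lags = [1, 3, 6, 12], [0, 3, 6]
print(best_aspect(iv, y, sample_idx, windows, lags, "pcc"))
print(best_aspect(iv, y, sample_idx, windows, lags, "press"))
```

In a full analysis the selected aspects of all IVs would then enter a multiple linear regression, and the selection would be done on training data only so that the MSEP on the testing data remains an honest estimate.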

Acknowledgments

The authors wish to thank Michael Tryby (USEPA) and anonymous referees for insightful review comments. We acknowledge the support of the City of Milwaukee Health Department, South Shore Yacht Club, and Daniel Feinstein. This paper has been reviewed in accordance with the U.S. Environmental Protection Agency's peer and administrative review policies and approved for publication. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

Author information

Corresponding author

Correspondence to Michael Cyterski.

About this article

Cite this article

Cyterski, M., Zhang, S., White, E. et al. Temporal Synchronization Analysis for Improving Regression Modeling of Fecal Indicator Bacteria Levels. Water Air Soil Pollut 223, 4841–4851 (2012). https://doi.org/10.1007/s11270-012-1240-3

  • DOI: https://doi.org/10.1007/s11270-012-1240-3
