Abstract
Multiple linear regression models are often used to predict levels of fecal indicator bacteria (FIB) in recreational swimming waters based on independent variables (IVs) such as meteorologic, hydrodynamic, and water-quality measures. The IVs used for these analyses are traditionally measured at the same time as the water-quality sample. We investigated the improvement in empirical modeling performance by using IVs that had been temporally synchronized with the FIB response variable. We first examined the univariate relationship between multiple “aspects” of each IV and the response variable to find the single aspect of each IV most strongly related to the response. Aspects are defined by the temporal window and lag (relative to when the response is measured) over which the IV is averaged. Models were then formed using the “best” aspects of each IV. Employing iterative cross-validation, we examined the average improvement in the mean squared error of prediction, MSEP, for a testing dataset after using our temporal synchronization technique on the training data. We compared the MSEP values of three methodologies: predictions made using unsynchronized IVs (UNS), predictions made using synchronized IVs where aspects were chosen using a Pearson correlation coefficient (PCC), and predictions using IV aspects chosen using the PRESS statistic (PRS). Averaging over 500 randomly generated testing datasets, the MSEP values using the PRS technique were 50 % lower (p < 0.001) than the MSEP values of the UNS technique. The average MSEP values of the PCC technique were 26 % lower (p < 0.001) than the MSEP values of the UNS technique. We conclude that temporal synchronization is capable of significantly improving predictive models of FIB levels in recreational swimming waters.
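The aspect-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an IV recorded on a regular time grid, defines an aspect as the mean of the IV over `window` steps ending `lag` steps before each sample, and scores each candidate with either the absolute Pearson correlation or the PRESS statistic of a one-variable regression. The function names, the grid of windows and lags, and the synthetic data are all hypothetical.

```python
import numpy as np

def aspect(iv, sample_times, window, lag):
    """Mean of iv over `window` steps, ending `lag` steps before each sample."""
    ends = np.asarray(sample_times) - lag  # last index included in each window
    return np.array([iv[e - window + 1 : e + 1].mean() for e in ends])

def press(x, y):
    """PRESS statistic for a simple linear regression of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    H = X @ np.linalg.pinv(X.T @ X) @ X.T      # hat matrix
    resid = y - H @ y                          # ordinary residuals
    # Leave-one-out residuals follow from the hat-matrix diagonal.
    return float(np.sum((resid / (1.0 - np.diag(H))) ** 2))

def best_aspect(iv, y, sample_times, windows, lags, criterion="pcc"):
    """Return the (window, lag) pair whose aspect is most strongly related to y."""
    best, best_score = None, np.inf
    for w in windows:
        for l in lags:
            x = aspect(iv, sample_times, w, l)
            if criterion == "pcc":
                score = -abs(np.corrcoef(x, y)[0, 1])  # larger |r| is better
            else:
                score = press(x, y)                    # smaller PRESS is better
            if score < best_score:
                best, best_score = (w, l), score
    return best

# Synthetic illustration (assumed data): the response depends on the IV
# averaged over 3 steps and lagged by 2 steps, plus a small amount of noise.
rng = np.random.default_rng(0)
iv = rng.normal(size=400)
sample_times = np.arange(20, 400, 5)
y = aspect(iv, sample_times, 3, 2) + 0.1 * rng.normal(size=len(sample_times))

windows, lags = (1, 3, 6), (0, 2, 4)
chosen_pcc = best_aspect(iv, y, sample_times, windows, lags, criterion="pcc")
chosen_prs = best_aspect(iv, y, sample_times, windows, lags, criterion="press")
print(chosen_pcc, chosen_prs)
```

In a full analysis, the chosen aspect of each IV would then enter the multiple regression, and MSEP would be computed on held-out testing data, as the abstract describes.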
Acknowledgments
The authors wish to thank Michael Tryby (USEPA) and anonymous referees for insightful review comments. We acknowledge the support of the City of Milwaukee Health Department, South Shore Yacht Club, and Daniel Feinstein. This paper has been reviewed in accordance with the U.S. Environmental Protection Agency's peer and administrative review policies and approved for publication. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.
Cite this article
Cyterski, M., Zhang, S., White, E. et al. Temporal Synchronization Analysis for Improving Regression Modeling of Fecal Indicator Bacteria Levels. Water Air Soil Pollut 223, 4841–4851 (2012). https://doi.org/10.1007/s11270-012-1240-3
DOI: https://doi.org/10.1007/s11270-012-1240-3