Nearest neighbor time series bootstrap for generating influent water quality scenarios

  • Original Paper
Stochastic Environmental Research and Risk Assessment

Abstract

Understanding influent water quality variability is essential for the long-term planning of potable water systems. To quantify variability and generate realistic influent scenarios, we propose a nonparametric time series approach based on k-nearest neighbor (k-NN) bootstrap resampling. The k-NN approach resamples historical data conditioned on a “feature vector” at a given time to generate values at subsequent times. We modified this algorithm by adding random perturbations to the resampled values to generate realistic extremes unobserved in the historical record. k-NN is widely used in stochastic hydrology and hydroclimatology; however, it is adapted here for the multivariate, data-limited context of water treatment. To examine the performance of the algorithm, we applied it to an eleven-year, monthly water quality dataset of alkalinity, temperature, total organic carbon, and pH from the Cache la Poudre River in Colorado. We found that the k-NN simulations captured the relevant distributional statistics of the historical record, which suggests that the algorithm produces realistic and varied scenarios. When used in conjunction with modeling and optimization, these scenarios have the potential to improve the sustainability, resilience, and efficiency of potable water systems.
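The resampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `knn_bootstrap`, the use of the previous time step as the feature vector, the sqrt(n) choice of k, the 1/rank neighbor weights, the standard-deviation-scaled Euclidean distance, and the Gaussian perturbation controlled by `noise_scale` are all assumptions chosen for illustration. The abstract does not specify any of these details.

```python
import math
import random


def knn_bootstrap(history, n_steps, k=None, noise_scale=0.1, seed=None):
    """Simulate a multivariate series by k-NN resampling of `history`.

    history: list of equal-length tuples, one per time step (e.g. monthly
    alkalinity, temperature, TOC, pH). Returns n_steps simulated tuples.
    Hypothetical sketch: every default below is an illustrative assumption.
    """
    rng = random.Random(seed)
    n_vars = len(history[0])
    candidates = history[:-1]  # only rows that have a successor
    n = len(candidates)
    if k is None:
        k = max(1, round(math.sqrt(n)))  # assumed heuristic for k
    k = min(k, n)

    # Per-variable standard deviations scale both distances and noise.
    total = len(history)
    means = [sum(row[j] for row in history) / total for j in range(n_vars)]
    sds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in history) / (total - 1))
        or 1.0  # guard against a constant variable
        for j in range(n_vars)
    ]

    # Assumed weighting: closer neighbors are more likely to be picked.
    weights = [1.0 / rank for rank in range(1, k + 1)]

    def distance(a, b):
        return math.sqrt(sum(((a[j] - b[j]) / sds[j]) ** 2 for j in range(n_vars)))

    state = history[rng.randrange(n)]  # start from a random historical state
    simulated = []
    for _ in range(n_steps):
        # Rank historical states by similarity to the current feature vector.
        order = sorted(range(n), key=lambda i: distance(state, candidates[i]))
        pick = rng.choices(order[:k], weights=weights, k=1)[0]
        successor = history[pick + 1]  # value that followed the chosen neighbor
        # Assumed perturbation: Gaussian noise lets values leave the record.
        state = tuple(
            successor[j] + rng.gauss(0.0, noise_scale * sds[j])
            for j in range(n_vars)
        )
        simulated.append(state)
    return simulated
```

With `noise_scale=0.0` the sketch reduces to plain resampling, so every simulated value is one that appears in the historical record. A positive `noise_scale` allows values outside the observed range, which is the role the abstract assigns to the perturbation step. A practical version would likely also need to keep perturbed values physically plausible (for example, non-negative concentrations).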

Acknowledgements

The authors would like to thank Jill Oropeza and Jared Health at City of Fort Collins Utilities for providing the Cache la Poudre water quality data. This work was supported by the U.S. Environmental Protection Agency “National Priorities: Systems-Based Strategies to Improve the Nation's Ability to Plan and Respond to Water Scarcity and Drought Due to Climate Change”, Grant No. R835865. The contents of this manuscript are solely the responsibility of the grantee and do not necessarily represent the official views of the US EPA.

Author information

Corresponding author

Correspondence to William J. Raseman.

Electronic supplementary material

Supplementary file1 (DOCX 1589 kb)

About this article

Cite this article

Raseman, W.J., Rajagopalan, B., Kasprzyk, J.R. et al. Nearest neighbor time series bootstrap for generating influent water quality scenarios. Stoch Environ Res Risk Assess 34, 23–31 (2020). https://doi.org/10.1007/s00477-019-01762-3

  • DOI: https://doi.org/10.1007/s00477-019-01762-3
