Abstract
Understanding influent water quality variability is essential for the long-term planning of potable water systems. To quantify variability and generate realistic influent scenarios, we propose a nonparametric time series approach based on k-nearest neighbor (k-NN) bootstrap resampling. The k-NN approach resamples historical data conditioned on a “feature vector” at a given time to generate values at subsequent times. We modified this algorithm by adding random perturbations to the resampled values to generate realistic extremes unobserved in the historical record. k-NN is widely used in stochastic hydrology and hydroclimatology; however, it is adapted here for the multivariate, data-limited context of water treatment. To examine the performance of the algorithm, we applied it to an eleven-year, monthly water quality dataset of alkalinity, temperature, total organic carbon, and pH from the Cache la Poudre River in Colorado. We found that the k-NN simulations captured the relevant distributional statistics of the historical record, which suggests that the algorithm produces realistic and varied scenarios. When used in conjunction with modeling and optimization, these scenarios have the potential to improve the sustainability, resilience, and efficiency of potable water systems.
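To make the resampling idea concrete, the following is a minimal sketch of a k-NN bootstrap with perturbation, not the authors' implementation. Several details are assumptions made for illustration and are not stated in the abstract: the function name `knn_bootstrap`, a neighborhood size near the square root of the record length, resampling weights proportional to 1/j for the j-th nearest neighbor, Euclidean distance on standardized values, and Gaussian perturbations controlled by a hypothetical `noise_scale` parameter. The sketch also uses only the previous time step as the feature vector and does not condition on season.

```python
import math
import random
import statistics


def knn_bootstrap(history, n_steps, k=None, noise_scale=0.1, seed=None):
    """Simulate a multivariate series by k-NN resampling with perturbation.

    history: list of rows (one per time step, oldest first); each row is a
    list of floats (one per water quality variable).
    """
    rng = random.Random(seed)
    n_times = len(history)
    if k is None:
        k = round(math.sqrt(n_times - 1))  # heuristic, assumed here
    k = max(1, min(k, n_times - 1))

    # Standardize each variable so all contribute equally to distances.
    cols = list(zip(*history))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    z = [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in history]

    # Rank-based weights (assumed): the j-th nearest neighbor gets weight 1/j.
    weights = [1.0 / j for j in range(1, k + 1)]

    state = rng.choice(z)  # random starting feature vector
    sim = []
    for _ in range(n_steps):
        # Only times with a successor (all but the last) are candidates.
        order = sorted(range(n_times - 1), key=lambda i: math.dist(z[i], state))
        pick = rng.choices(order[:k], weights=weights)[0]
        # Resample the successor, then perturb it so that values outside
        # the historical record can occur.
        state = [v + rng.gauss(0.0, noise_scale) for v in z[pick + 1]]
        # Convert back to the original units before storing.
        sim.append([v * s + m for v, m, s in zip(state, means, stds)])
    return sim
```

Calling `knn_bootstrap(history, n_steps=12, seed=1)` on a table of monthly observations would return one twelve-step synthetic scenario; repeated calls with different seeds would give an ensemble. Setting `noise_scale=0.0` reduces the sketch to a plain k-NN bootstrap, which can only reproduce values already in the record.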
Acknowledgements
The authors would like to thank Jill Oropeza and Jared Health at City of Fort Collins Utilities for providing the Cache la Poudre water quality data. This work was supported by the U.S. Environmental Protection Agency “National Priorities: Systems-Based Strategies to Improve the Nation's Ability to Plan and Respond to Water Scarcity and Drought Due to Climate Change”, Grant No. R835865. The contents of this manuscript are solely the responsibility of the grantee and do not necessarily represent the official views of the US EPA.
Cite this article
Raseman, W.J., Rajagopalan, B., Kasprzyk, J.R. et al. Nearest neighbor time series bootstrap for generating influent water quality scenarios. Stoch Environ Res Risk Assess 34, 23–31 (2020). https://doi.org/10.1007/s00477-019-01762-3
DOI: https://doi.org/10.1007/s00477-019-01762-3