Abstract
Despite numerous applications of Random Forest (RF) techniques in the water-quality field, their use to detect first-flush (FF) events is limited. In this study, we developed a stormwater management framework based on RF algorithms and two different FF definitions (30/80 and M(V) curve). This framework can predict the FF intensity of a single rainfall event for three of the most frequently detected pollutants in urban areas (TSS, TN, and TP), yielding satisfactory results (30/80: \(accuracy_{average}\) = 0.87; M(V) curve: \(accuracy_{average}\) = 0.75). Furthermore, the framework can quantify and rank the most critical variables by their importance in predicting FF, using a non-model-biased method based on game theory. Unlike classical physically-based models, which require catchment and drainage information in addition to meteorological data, our framework takes only rainfall-runoff variables as inputs. It is also generic and independent of the data adopted in this study, so it can be applied to any other geographical region with a complete rainfall-runoff dataset. Therefore, the framework is expected to contribute to accurate FF prediction, which can be exploited for the design of treatment systems that store and treat the FF-runoff volume.
Data Availability
The water quality dataset described in this article can be accessed from https://gitlab.com/fing-hydroinformatics/first-flush-rfc/-/tree/main/data.
Code Availability
The stormwater management framework developed for this work is freely available at https://gitlab.com/fing-hydroinformatics/first-flush-rfc. It was implemented in Python 3 using Conda (two scripts, one for Linux and one for MS Windows, are provided to generate the software environment with all its requirements). The framework can be run on any general-purpose computer.
References
Akiba T, Sano S, Yanase T, Ohta T, Koyama M (2019) Optuna: a next-generation hyperparameter optimization framework. Association for Computing Machinery, New York, NY, USA, p 2623–2631. https://doi.org/10.1145/3292500.3330701
Alley WM, Smith PE (1981) Estimation of accumulation parameters for urban runoff quality modeling. Water Resour Res 17(6):1657–1664. https://doi.org/10.1029/WR017i006p01657
Baak M, Koopman R, Snoek H, Klous S (2020) A new correlation coefficient between categorical, ordinal and interval variables with Pearson characteristics. Comput Stat Data Anal 152. https://doi.org/10.1016/j.csda.2020.107043
Baird R, Eaton A, Rice E (2017) Standard methods for the examination of water and wastewater, 23rd edn. American Public Health Association, American Water Works Association, and Water Environment Federation
Bertrand-Krajewski JL, Chebbo G, Saget A (1998) Distribution of pollutant mass vs volume in stormwater discharges and the first flush phenomenon. Water Res 32(8):2341–2356. https://doi.org/10.1016/S0043-1354(97)00420-X
Breiman L (1996) Bagging predictors. Mach Learn 24:123–140. https://doi.org/10.1007/BF00058655
Breiman L (2001) Random forests. Mach Learn 45:5–32. https://doi.org/10.1023/A:1010933404324
Creaco E, Berardi L, Sun S, Giustolisi O, Savic D (2016) Selection of relevant input variables in storm water quality modeling by multiobjective evolutionary polynomial regression paradigm. Water Resour Res 52(4):2403–2419. https://doi.org/10.1002/2015WR017971
Cross T, Sathaye K, Darnell K, Niederhut D, Crifasi K (2020) Predicting water production in the Williston basin using a machine learning model, p 3492–3503. https://doi.org/10.15530/urtec-2020-2756
Dams J, Dujardin J, Reggers R, Bashir I, Canters F, Batelaan O (2013) Mapping impervious surface change from remote sensing for hydrological modeling. J Hydrol 485:84–95. Hydrology of peri-urban catchments: Processes and modelling. https://doi.org/10.1016/j.jhydrol.2012.09.045
Di Modugno M, Gioia A, Gorgoglione A, Iacobellis V, La Forgia G, Piccinni AF, Ranieri E (2015) Build-up/wash-off monitoring and assessment for sustainable management of first flush in an urban area. Sustainability 7(5):5050–5070. https://doi.org/10.3390/su7055050
Egodawatta P, Thomas E, Goonetilleke A (2007) Mathematical interpretation of pollutant wash-off from urban road surfaces using simulated rainfall. Water Res 41(13):3025–3031. https://doi.org/10.1016/j.watres.2007.03.037
Egodawatta P, Thomas E, Goonetilleke A (2009) Understanding the physical processes of pollutant build-up and wash-off on roof surfaces. Sci Total Environ 407(6):1834–1841. https://doi.org/10.1016/j.scitotenv.2008.12.027
Geiger W (1984) Characteristics of combined sewer runoff. In: Proceeding de la 3ème conférence internationale «Urban Storm Drainage», Göteborg, p 4–8
Gnecco I, Berretta C, Lanza L, La Barbera P (2005) Storm water pollution in the urban environment of Genoa, Italy. Atmos Res 77(1):60–73. Precipitation in Urban Areas. https://doi.org/10.1016/j.atmosres.2004.10.017
Gorgoglione A, Gioia A, Iacobellis V, Piccinni AF, Ranieri E (2016) A rationale for pollutograph evaluation in ungauged areas, using daily rainfall patterns: Case studies of the Apulian region in Southern Italy. Appl Environ Soil Sci 2016. https://doi.org/10.1155/2016/9327614
Gorgoglione A, Bombardelli FA, Pitton BJL, Oki LR, Haver DL, Young TM (2018) Role of sediments in insecticide runoff from urban surfaces: Analysis and modeling. Int J Environ Res Pub Health 15(7). https://doi.org/10.3390/ijerph15071464
Gorgoglione A, Bombardelli FA, Pitton BJ, Oki LR, Haver DL, Young TM (2019a) Uncertainty in the parameterization of sediment build-up and wash-off processes in the simulation of sediment transport in urban areas. Environ Model Software 111:170–181. https://doi.org/10.1016/j.envsoft.2018.09.022
Gorgoglione A, Gioia A, Iacobellis V (2019b) A framework for assessing modeling performance and effects of rainfall-catchment-drainage characteristics on nutrient urban runoff in poorly gauged watersheds. Sustainability 11(18). https://doi.org/10.3390/su11184933
Gorgoglione A, Castro A, Gioia A, Iacobellis V (2020a) Application of the self-organizing map (SOM) to characterize nutrient urban runoff. In: Gervasi O, Murgante B, Misra S, Garau C, Blečić I, Taniar D, Apduhan BO, Rocha AMAC, Tarantino E, Torre CM, Karaca Y (eds) Computational Science and its Applications – ICCSA 2020, Springer International Publishing, Cham, p 680–692. https://doi.org/10.1007/978-3-030-58811-3_49
Gorgoglione A, Gregorio J, Ríos A, Alonso J, Chreties C, Fossati M (2020b) Influence of land use/land cover on surface-water quality of Santa Lucía river, Uruguay. Sustainability 12(11). https://doi.org/10.3390/su12114692
Gorgoglione A, Castro A, Iacobellis V, Gioia A (2021) A comparison of linear and non-linear machine learning techniques (PCA and SOM) for characterizing urban nutrient runoff. Sustainability 13(4). https://doi.org/10.3390/su13042054
Guan M, Sillanpää N, Koivusalo H (2015) Modelling and assessment of hydrological changes in a developing urban catchment. Hydrol Process 29(13):2880–2894. https://doi.org/10.1002/hyp.10410
Helsel DR, Kim JI, Grizzard TJ, Randall CW, Hoehn RC (1979) Land use influences on metals in storm drainage. J Water Pollut Control Fed 51(4):709–717
Hur S, Nam K, Kim J, Kwak C (2018) Development of urban runoff model FFC-QUAL for first-flush water-quality analysis in urban drainage basins. J Environ Manage 205:73–84. https://doi.org/10.1016/j.jenvman.2017.09.060
Jeung M, Baek SS, Beom J, Cho K, Her Y, Yoon K (2019) Evaluation of random forest and regression tree methods for estimation of mass first flush ratio in urban catchments. J Hydrol 575:1099–1110. https://doi.org/10.1016/j.jhydrol.2019.05.079
Kang JH, Kayhanian M, Stenstrom MK (2006) Implications of a kinematic wave model for first flush treatment design. Water Res 40(20):3820–3830. https://doi.org/10.1016/j.watres.2006.09.007
Lee JY, Kim H, Kim Y, Han MY (2011) Characteristics of the event mean concentration (EMC) from rainfall runoff on an urban highway. Environ Pollut 159(4):884–888. https://doi.org/10.1016/j.envpol.2010.12.022
Li MH, Barrett ME (2008) Relationship between antecedent dry period and highway pollutant: Conceptual models of buildup and removal processes. Water Environ Res 80(8):740–747. https://doi.org/10.2175/106143008x296451
Liu A, Gunawardana C, Gunawardena J, Egodawatta P, Ayoko GA, Goonetilleke A (2016) Taxonomy of factors which influence heavy metal build-up on urban road surfaces. J Hazard Mater 310:20–29. https://doi.org/10.1016/j.jhazmat.2016.02.026
Lundberg SM, Lee SI (2017) A unified approach to interpreting model predictions. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in Neural Information Processing Systems 30, Curran Associates, Inc., p 4765–4774
Padarian J, McBratney AB, Minasny B (2020) Game theory interpretation of digital soil mapping convolutional neural networks. Soil 6(2):389–397. https://doi.org/10.5194/soil-6-389-2020
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: Machine learning in Python. J Mach Learn Res 12:2825–2830
Perera T, McGree J, Egodawatta P, Jinadasa K, Goonetilleke A (2019) Taxonomy of influential factors for predicting pollutant first flush in urban stormwater runoff. Water Res 166. https://doi.org/10.1016/j.watres.2019.115075
Regione Puglia (2013) Regional Regulation, 9 December 2013, no. 26, “Stormwater runoff and first flush regulations” (implementation of article 13 of Legislative Decree no. 152/06 and subsequent amendments)
Rodríguez R, Pastorini M, Etcheverry L, Chreties C, Fossati M, Castro A, Gorgoglione A (2021) Water-quality data imputation with a high percentage of missing values: a machine learning approach. Sustainability 13(11). https://doi.org/10.3390/su13116318
Rossman LA (2015) Storm Water Management Model User’s Manual Version 5.1. U.S. Environmental Protection Agency (EPA), National Risk Management Research Laboratory Office of Research and Development U.S. Environmental Protection Agency, Cincinnati, OH, USA
Saget A, Chebbo G, Bertrand-Krajewski JL (1996) The first flush in sewer systems. Water Sci Technol 33(9):101–108. Solids in Sewers. https://doi.org/10.1016/0273-1223(96)00375-7
Sartor JD, Boyd GB, Agardy FJ (1974) Water pollution aspects of street surface contaminants. J Water Pollut Control Fed 46(3):458–467
Shapley LS (1997) A value for n-person games. Classics in game theory 69
SIT Puglia (2021) SIT Puglia. http://www.sit.puglia.it/. Accessed 15 Dec 2021
Sun A, Scanlon B (2019) How can big data and machine learning benefit environment and water management: a survey of methods, applications, and future directions. Environ Res Lett 14(7). https://doi.org/10.1088/1748-9326/ab1b7d
Uusitalo L, Lehikoinen A, Helle I, Myrberg K (2015) An overview of methods to evaluate uncertainty of deterministic models in decision support. Environ Model Software 63:24–31. https://doi.org/10.1016/j.envsoft.2014.09.017
Veneziano D, Iacobellis V (2002) Multiscaling pulse representation of temporal rainfall. Water Resour Res 38(8):13-1–13-13. https://doi.org/10.1029/2001WR000522
Veneziano D, Furcolo P, Iacobellis V (2002) Multifractality of iterated pulse processes with pulse amplitudes generated by a random cascade. Fractals 10(02):209–222. https://doi.org/10.1142/S0218348X02001026
Vilaseca F, Castro A, Chreties C, Gorgoglione A (2021) Daily rainfall-runoff modeling at watershed scale: A comparison between physically-based and data-driven models. In: Gervasi O, Murgante B, Misra S, Garau C, Blečić I, Taniar D, Apduhan BO, Rocha AMAC, Tarantino E, Torre CM (eds) Computational Science and Its Applications - ICCSA 2021. Springer International Publishing, Cham, pp 18–33
Wang F, Wang Y, Zhang K, Hu M, Weng Q, Zhang H (2021) Spatial heterogeneity modeling of water quality based on random forest regression and model interpretation. Environ Res 202. https://doi.org/10.1016/j.envres.2021.111660
Yang YY, Lusk MG (2018) Nutrients in urban stormwater runoff: Current state of the science and potential mitigation options. Curr Pollut Rep 4:112–127. https://doi.org/10.1007/s40726-018-0087-7
Zhong S, Zhang K, Wang D, Zhang H (2021) Shedding light on black box machine learning models for predicting the reactivity of HO radicals toward organic compounds. Chem Eng J 405. https://doi.org/10.1016/j.cej.2020.126627
Funding
Cosimo Russo’s visit to Universidad de la República was partially funded by the exchange program “Tesi all’estero” of Politecnico di Milano.
Author information
Authors and Affiliations
Contributions
Cosimo Russo: Methodology, Software, Validation, Formal analysis, Investigation, Data Curation, Writing - Original Draft, Visualization, Funding acquisition. Alberto Castro: Conceptualization, Methodology, Investigation, Writing - Review & Editing, Supervision, Funding acquisition. Andrea Gioia: Resources, Writing - Review & Editing. Vito Iacobellis: Resources, Writing - Review & Editing. Angela Gorgoglione: Conceptualization, Methodology, Investigation, Resources, Writing - Original Draft, Writing - Review & Editing, Supervision, Project administration, Funding acquisition. All authors read and approved the final manuscript.
Ethics declarations
Ethics Approval
Not applicable.
Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Competing Interests
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A. Supplementary information
A.1 SWMM Model Description and Implementation
SWMM simulates the hydrograph and pollutograph of real storm events (in single-event or long-term mode) based on rainfall and other meteorological inputs and on system characteristics (catchment, conveyance, and storage/treatment) for urban and peri-urban watersheds. SWMM is organized in blocks, or operating units. Each block can be used individually or in a cascade, and an executive block coordinates their outputs. The runoff and transport blocks were used in this study. Using the inlet hydrographs generated by the runoff block, the transport block routes flow and pollutants through the drainage network.
To simulate the runoff from urban surfaces, the kinematic-wave equation was chosen. The water losses taken into account were the depression storage on the impervious portion of the watershed and the infiltration process. The latter was modeled by evaluating, for each subcatchment, the percentage of impervious and pervious area obtained from the land-use map. The infiltration model adopted in this work was based on Horton’s equation, whose parameter values were chosen according to representative values reported in the literature for each soil type. Eight parameters of the runoff block of SWMM were used to calibrate the hydraulic-hydrologic model: the depth of depression storage on impervious (\(Dstore-Imperv\)) and pervious (\(Dstore-Perv\)) portions of the subcatchment, Manning’s coefficient for overland flow over the impervious (\(N-Imperv\)) and pervious (\(N-Perv\)) portions of the subcatchment, the percent of the impervious area without depression storage (\(\%Zero Imperv\)), and the infiltration parameters of Horton’s equation.
Pollutant build-up within a land-use category is described by a mass per unit of subcatchment area. The amount of build-up is a function of the number of dry-weather days antecedent to the rainfall event. The build-up function follows a growth law that asymptotically approaches a maximum limit:
\[M_a(d_{adp}) = \frac{Accu}{Disp} \cdot A \cdot P_{imp} \cdot \left(1 - e^{-Disp \cdot d_{adp}}\right)\]
where \(M_a(d_{adp})\) represents the pollutant build-up during the antecedent dry period [kg/ha]; \(d_{adp}\) is the number of antecedent dry days [d]; Disp is the parameter that measures the disappearance of accumulated solids due to the action of wind or vehicular traffic [1/d]; A is the subcatchment area; \(P_{imp}\) is the impervious area fraction; Accu is the parameter that characterizes the solids build-up rate [kg/(ha d)]; and \(\frac{Accu}{Disp} \cdot A \cdot P_{imp}\) represents the maximum asymptotic limit of the build-up curve. The pollutant wash-off over different land uses takes place during wet periods, and it is described by the differential equation:
\[\frac{dM_d(t)}{dt} = Arra \cdot i(t)^{wash} \cdot M_a(t)\]
where \(\frac{dM_d(t)}{dt}\) is the wash-off load rate [kg/h]; Arra is the wash-off coefficient [\(mm^{-1}\)]; i(t) is the runoff rate [mm/h]; wash is the wash-off exponent, a parameter that controls the influence of rainfall intensity on the amount of leached pollutants; and \(M_a(t)\) is the pollutant mass still available on the surface at time t. Four parameters of the runoff block were identified for the calibration of the water-quality model. For the build-up function: the parameter that characterizes the solids build-up rate (Accu) and the parameter that identifies the disappearance of accumulated sediments due to wind or vehicular traffic (Disp). For the wash-off function: the wash-off coefficient (Arra) and the wash-off exponent (wash).
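As an illustration of how the four water-quality parameters interact, the following minimal Python sketch evaluates an exponential build-up law with asymptote \(\frac{Accu}{Disp} \cdot A \cdot P_{imp}\) and integrates a wash-off rate of the form \(Arra \cdot i(t)^{wash} \cdot M_a(t)\) with an explicit Euler step. It is not the SWMM implementation: the function names, the time step, the runoff series, and all parameter values are hypothetical and chosen only for demonstration.

```python
import math


def buildup(d_adp, accu, disp, area, p_imp):
    """Mass accumulated after d_adp dry days (exponential growth law).

    Assumed form: (Accu/Disp) * A * P_imp * (1 - exp(-Disp * d_adp)).
    """
    return (accu / disp) * area * p_imp * (1.0 - math.exp(-disp * d_adp))


def washoff(m0, runoff, arra, wash, dt):
    """Explicit-Euler integration of dM_d/dt = Arra * i(t)**wash * M_a(t).

    m0     : mass available at the start of the event
    runoff : runoff rates i(t) [mm/h], one value per time step
    dt     : time-step length [h]
    Returns the cumulative washed-off mass after each step.
    """
    remaining = m0
    total = 0.0
    cumulative = []
    for i_t in runoff:
        rate = arra * i_t ** wash * remaining
        # Never remove more mass than is still on the surface.
        step = min(rate * dt, remaining)
        remaining -= step
        total += step
        cumulative.append(total)
    return cumulative


# Hypothetical parameter values, for illustration only.
ACCU, DISP, AREA, P_IMP = 2.0, 0.1, 10.0, 0.8
m_max = (ACCU / DISP) * AREA * P_IMP          # asymptotic build-up limit
m0 = buildup(7, ACCU, DISP, AREA, P_IMP)      # seven antecedent dry days
loads = washoff(m0, [2.0, 10.0, 5.0, 1.0], arra=0.2, wash=1.2, dt=0.25)
print(f"build-up: {m0:.1f} of a maximum {m_max:.1f}")
print("cumulative wash-off:", [round(v, 2) for v in loads])
```

Larger values of wash make the washed-off load more sensitive to peaks in i(t), while the ratio Accu/Disp fixes the ceiling that build-up approaches after long dry periods.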
A.2 Exploratory Data Analysis: Indices and Results
Spearman, Kendall, and Phik coefficients can capture non-linear correlations. Spearman’s coefficient exploits monotonicity, while Kendall’s measures ordinal association. Both take values in \([-1, 1]\), where -1 indicates a perfect negative correlation, +1 a perfect positive correlation, and 0 no correlation. The formulas are defined in Eqs. (14) and (15), respectively.
\[\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \qquad (14)\]
\[\tau = \frac{n_c - n_d}{n(n-1)/2} \qquad (15)\]
For Spearman’s \(\rho\), \(d_i\) is the difference between the two ranks of each observation, and n is the number of observations. For Kendall’s \(\tau\), the definition of concordant and discordant pairs is needed: a pair of values \((x_i, y_i), (x_j, y_j), i < j\) is concordant if \(x_i < x_j\) and \(y_i < y_j\) or if \(x_i > x_j\) and \(y_i > y_j\), and discordant if the two orderings disagree; \(n_c\) and \(n_d\) are the numbers of concordant and discordant pairs, respectively.
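The two rank-based coefficients can be illustrated with a short pure-Python sketch. It assumes the textbook forms \(\rho = 1 - 6\sum d_i^2 / (n(n^2-1))\) and \(\tau = (n_c - n_d) / (n(n-1)/2)\), which hold when there are no tied values; the helper names and the sample data are hypothetical and are not taken from the study dataset.

```python
from itertools import combinations


def ranks(values):
    """Rank observations from 1 to n (assumes no tied values)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho from the squared rank differences d_i."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


def kendall_tau(x, y):
    """Kendall's tau from counts of concordant and discordant pairs."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical data: a monotonic but strongly non-linear relationship.
x_cubic = [1.0, 2.0, 3.0, 4.0]
y_cubic = [v ** 3 for v in x_cubic]
# Hypothetical data with two swapped neighbours.
x_mixed = [1, 2, 3, 4, 5]
y_mixed = [2, 1, 4, 3, 5]

print(spearman_rho(x_cubic, y_cubic), kendall_tau(x_cubic, y_cubic))
print(spearman_rho(x_mixed, y_mixed), kendall_tau(x_mixed, y_mixed))
```

Because both coefficients depend only on orderings, the cubic relationship scores a perfect +1 even though it is far from linear. For data with ties, a tie-corrected variant would be needed.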
The Phik coefficient (Baak et al. 2020) is also a non-linear correlation coefficient, refined to work consistently with both continuous and categorical variables.
The corresponding correlation heatmaps are reported in Figs. 5 and 6.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Russo, C., Castro, A., Gioia, A. et al. A Stormwater Management Framework for Predicting First Flush Intensity and Quantifying its Influential Factors. Water Resour Manage 37, 1437–1459 (2023). https://doi.org/10.1007/s11269-023-03438-8
DOI: https://doi.org/10.1007/s11269-023-03438-8