Abstract
The goal of this study is to investigate the uncertainty of an urban sewer system’s response under various rainfall and infrastructure scenarios by applying a recently developed nonparametric copula-based simulation approach to extreme rainfall fields. The approach allows for Monte Carlo simulation of multiple variables with differing marginal distributions and arbitrary dependence structure. The independent and identically distributed daily extreme rainfall events of the corresponding urban area, extracted from the nationwide high-resolution Stage IV radar data, are the inputs of the spatial simulator. The simulated extreme rainfall fields were used to calculate excess runoff using the Natural Resources Conservation Service’s approach. New York City is selected as a case study and the results highlight the importance of preserving the spatial dependence of rainfall fields between the grids, even for simplified hydrologic models. This study estimates the probability of combined sewer overflows under extreme rainfall events and identifies the most effective locations in New York City to install green infrastructure for detaining excess stormwater runoff. The results of this study are beneficial for planners working on stormwater management and the approach is broadly applicable because it does not rely on extensive sewer system information.
Introduction
Urban pluvial flooding is determined by the interaction of the spatial layout of urban drainage infrastructure and the spatiotemporal structure of rainfall (e.g., Smith et al. 2002, 2005; Ramos et al. 2005; Morin et al. 2006; Wright et al. 2013; Yang et al. 2013). Therefore, proper representation of the meteorological forcing of urban hydrologic systems is an essential aspect of predicting the performance of the underlying drainage infrastructure. Simulations that reproduce the space–time patterns of rainfall associated with preferred storm speeds and tracks can be used to improve the performance assessment, operation, and design of urban drainage infrastructure (Singh 1997; McRobie et al. 2013). The goal of this study is to credibly simulate extreme rainfall fields in order to quantify the uncertainty of the urban sewer system under different rainfall and infrastructure scenarios.
A common shortcoming of many efforts to assess pluvial flooding is the use of spatially uniform design storms estimated based on the return period of point rainfall data (e.g., Berne et al. 2004; Zhou et al. 2012; Notaro et al. 2013; Gires et al. 2015). Neglecting the spatial variation in the precipitation field, however, is an oversimplification in many cases and does not capture the variation in flood response to different spatial precipitation distributions, even when the design storm return period is fixed (Wheater et al. 2005; Simões et al. 2015). Until relatively recently, however, the spatial distributions of storms at suburban scales were not well measured.
A high spatial–temporal resolution dataset for precipitation is now available for most of the United States and can be implemented in hydrologic analysis of dense urban environments (e.g., Smith et al. 2002; Gourley et al. 2014). While the tools for statistical modeling of point-based extreme events are well developed (e.g., Coles et al. 2001), extending these tools to model spatial extreme data is an active area of research. Some approaches use rainfall generators to simulate precipitation fields and investigate the urban drainage design using parametric (Willems 2001a; McRobie et al. 2013; Nuswantoro et al. 2014; Simões et al. 2015) or nonparametric approaches (Harrold et al. 2003; Mehrotra et al. 2015). Parametric simulation approaches (e.g., Apipattanavis et al. 2007; Chen et al. 2011) typically require assuming that the precipitation data is multivariate normal and often do not preserve cross dependencies that exist in the data. Existing nonparametric approaches, on the other hand, often do not consider the spatial dependence of the rainfall field between the grids (e.g., Harrold et al. 2003). The nonparametric simulation approach employed in this study has the advantage of avoiding assumptions about the data distribution, while preserving the empirical spatial dependence based on the historic extreme precipitation fields.
In this study, we applied the multivariate simulation method described by Lall et al. (2016) to extreme rainfall fields for the first time. The procedure allows Monte Carlo simulation of multiple variables with differing marginal and joint distributions. We used daily radar-derived rainfall data (spatial resolution of 4 km by 4 km) to identify extreme rainfall fields and imported them into the simulator. The simulated extreme rainfall fields and city infrastructure information were used to compute excess runoff through the Natural Resources Conservation Service (NRCS) approach. We introduced a simulation-based framework that accounts for the spatial structure of extreme precipitation. That is, we did not use a parametric spatial model, but instead preserved the empirical dependence between grid cells via the method developed in Lall et al. (2016). We used this simulation model to estimate both the excess runoff under changes to the city infrastructure (addressing source control stormwater management) and the probability of exceeding the treatment capacity of the city under different rainfall scenarios (addressing end-of-pipe stormwater management). We applied the framework to New York City (NYC) as a case study; the City has been facing the challenge of frequent extreme weather events, sewer system overflow, and flooding (Spierre and Wake 2010; Environmental Protection Bureau of the NYS 2014). Accounting for spatial structure is especially important for extreme rainfall in NYC given the distinct spatial patterns that have been shown to exist there (Hamidi et al. 2017). The results of this study are beneficial for planners working on stormwater management and the approach is broadly applicable because it does not rely on extensive sewer system information (i.e., catch basins’ exact locations, size and connections of the sewer pipes, etc.) as do many other urban stormwater models (e.g., Pina et al. 2016).
The paper is organized as follows. The study area and data are introduced in Sect. 2. In Sect. 3, the methodology to simulate extreme rainfall fields and compute excess runoff is described. In Sect. 4, the results are presented and discussed. Finally, we provide the conclusions of the study in Sect. 5.
Case study
Average annual rainfall in NYC has increased by nearly 20 mm in the last century (http://www.nyc.gov/dep) and climate projections indicate the potential for increasingly frequent intense storms (Horton et al. 2010). These facts make the City a compelling case study for urban hydrology (e.g., Rosenzweig et al. 2007; Cherrier et al. 2016). Today, much of the stormwater in NYC flows over impervious surfaces, which cover approximately 72% of NYC’s ~ 790 km^{2} land area, into roof drains or catch basins located on street and highway curbs and into the sewers (NYCDEP 2012a). More than 60% of NYC’s sewer system is combined, meaning it is used to convey both sanitary and storm flows. During heavy rainfall (or rapid snowmelt) events, combined sewers receive higher than normal flows. Because the Waste Water Treatment Plants (WWTPs) are unable to treat sewer flows that are more than twice their design capacity, a mix of excess stormwater and untreated wastewater is often discharged from outfalls directly into the City’s waterways to prevent upstream flooding. This untreated release is called a Combined Sewer Overflow (CSO). CSOs are a concern because of their negative effects on water quality in local waterways (Cherrier et al. 2016). For example, during 2004–2005 there were 35 CSOs recorded at two outfall locations of the Bronx River in the Bronx, NYC (De Sousa et al. 2012).
For the nearly 40% of NYC’s sewer system that is separated, stormwater runoff is discharged directly into the City’s waterways while sewage (e.g., industrial, commercial) is routed to the WWTPs. These portions of the system operate under the NYC Municipal Separate Storm Sewer System plan (NYC MS4 Progress Report 2016). In addition to the separated and combined systems, small portions of NYC close to the rivers have direct drainage systems where untreated water is conveyed directly into waterways. Figure 1a indicates the divisions of the NYC sewer system. Figure 1a also shows the location of the study area and the surrounding lands and rivers.
Currently the WWTPs in NYC are designed based on simulated flowrates derived with rainfall from 4 rain gauge stations: Central Park (CP), LaGuardia Airport (LGA), John F. Kennedy International Airport (JFK), and Newark International Airport (EWR), as shown in Fig. 1b. Rainfall rates are assumed to be uniform across each group of the WWTP drainage areas independently. Therefore, the current NYC Department of Environmental Protection (DEP) calibration model does not consider the spatial variation and spatial dependence among and between the drainage areas, respectively. The hypothesis of this study is that considering the spatial variation and dependence of extreme rainfall between grid cells will produce more realistic design criteria, particularly given the spatial clustering of extreme rainfall shown in Hamidi et al. (2017).
Data
NYC sewer system data
There are 14 WWTPs located along the coast and waterways in NYC (Fig. 1c). Each of the WWTPs is sized based on a Design Dry Weather Flow (DDWF), with total plant treatment capacity equal to 2 × DDWF. This results in a citywide treatment capacity of approximately 8.4 × 10^{6} m^{3}/day (1845 MGD). Figure 1c indicates the division of sewersheds and the locations of the 14 WWTPs. Locations of the 451 combined sewer outfalls are shown in Fig. 1d. This data is available in the 2012 DEP report (NYCDEP 2012a) and in the Open Sewer Atlas NYC (https://openseweratlas.tumblr.com). The land area normalized treatment capacity (i.e., \(2 \times DDWF/Area\)) for the full city is 27.2 mm/day. We made the simplifying assumption that all of the NYC sewer system is combined.
According to the City’s publicly available open data (shared at http://www.arcgis.com), there are on average 3110 catch basins per sewershed. The catch basins installed in NYC have a small storage volume of ~ 1.6 m^{3} each (NYCDEP 2009), with the potential of storing ~ 7×10^{4} m^{3} (18.5 MG) of runoff per event, which should be considered in calculating the runoff. Table 1 lists all 14 WWTPs, their corresponding sewershed area, the number of catch basins, and the radar grid cell numbers, which are described in the next section.
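The citywide storage figure can be checked with a few lines of arithmetic. The sketch below uses only the rounded values quoted in the text (14 sewersheds, an average of 3110 catch basins each, about 1.6 m^3 per basin); the variable names and the US-gallon conversion factor are introduced here for illustration.

```python
# Rounded inputs quoted in the text; names are illustrative only.
N_SEWERSHEDS = 14
BASINS_PER_SEWERSHED = 3110       # average count per sewershed
BASIN_VOLUME_M3 = 1.6             # approximate storage per catch basin
M3_PER_MILLION_GALLONS = 3785.41  # 1 million US gallons in cubic metres

total_m3 = N_SEWERSHEDS * BASINS_PER_SEWERSHED * BASIN_VOLUME_M3
total_mg = total_m3 / M3_PER_MILLION_GALLONS
print(f"{total_m3:.0f} m^3 is about {total_mg:.1f} MG")
```

The result is close to 7×10^{4} m^{3}, consistent with the rounded value reported above.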
Radar rainfall data
The Next Generation Weather Radar system (NEXRAD) comprises 160 Weather Surveillance Radar-1988 Doppler (WSR-88D) sites throughout the United States and at select overseas locations (Heiss et al. 1990). While single radar records may suffer from blockage at certain locations (Vivekanandan et al. 1999; Lang et al. 2009) as well as range limitations, multisensor (gauge, radar, and satellite) products minimize these errors (Miller et al. 2010). Multisensor precipitation estimator algorithms provide a real-time suite of gridded products at different spatial scales (Kitzmiller et al. 2013). In this study, the NCEP (National Centers for Environmental Prediction) Stage IV radar product was employed to generate extreme rainfall fields over the NYC area. Stage IV radar data is mosaicked from the regional multisensor precipitation estimates. This data is calibrated and adjusted for biases using automatic rain gauge measurements and quality control processes (Lin and Mitchell 2005). The data is reported at a spatial resolution of 4 km by 4 km and a temporal resolution of 1 h and is available from the Earth Observing Laboratory (http://data.eol.ucar.edu) from 2002 to present. Figure 2a shows the 76 radar grid cells that cover the entire land area of NYC (also see Table 1). Daily radar data from 2002 to 2015 (14 years) was used in this study to identify and simulate extreme rainfall fields in NYC. The Stage IV radar rainfall has been used in urban pluvial flood analysis research before (e.g., Gourley et al. 2014).
Rain gauge rainfall data
Rain gauge observation data was used in this study to provide a comparison for the radar data analysis. We used daily data from the four rain gauge stations cited in the introduction: CP, LGA, JFK, and EWR (see Fig. 1b). The data is archived at and available from the National Climatic Data Center (NCDC). We used the same time frame as for the radar data (i.e., 2002–2015).
Land cover and permeability data
High-resolution spatially distributed land cover data for NYC was provided by the Department of Parks and Recreation in 2010 (https://data.cityofnewyork.us/) at a spatial resolution of ~ 0.9 m (3 ft) by 0.9 m. There are different estimates of surface infiltration, and thus various ways to compute runoff (e.g., the Horton 1933 equation). We used the NRCS approach for runoff estimation in urban areas (Cronshey 1986), as proposed by the United States Department of Agriculture (USDA). The NRCS, formerly the Soil Conservation Service, developed the runoff Curve Number (CN) from empirical analysis of small catchment runoff. CN represents the hydrologic soil cover complex of the watershed with respect to the soil type, land use, surface condition, and the Antecedent Moisture Condition (AMC). Three levels of AMC are considered: AMC_{I} dry soil (but not to the wilting point), AMC_{II} average case, and AMC_{III} saturated soil. The development of these procedures is outlined in NEH-4 (National Engineering Handbook, Section 4: Hydrology, Soil Conservation Service 1985), and briefly explained in Sect. 3 of this paper. Generally, CN = 100 for impervious and water surfaces and 0 < CN < 100 for natural surfaces. We converted the land cover data to CN values for the normal moisture condition (AMC_{II}), which is also consistent with the runoff coefficients of NYC DEP (NYCDEP 2012b). The average CN value for each intersected area of radar grid cells with the sewershed borders is shown in Fig. 2b.
Methodology
Generating extreme rainfall fields
The extreme rainfall fields for this study were generated as follows:

1.
The 95th percentile (R_{95}) rainfall for each grid cell was computed (only nonzero rainy days were considered). The average of R_{95} among the 76 grid cells for NYC is ~ 31 mm/day (1.2 inch/day) with standard deviation of ~ 1.6 mm/day (0.06 inch/day).

2.
Daily extreme events were identified as any day when any of the 76 grid cells exceeds its R_{95}. This resulted in a total of 266 unique extreme event days for NYC (i.e., an extreme event occurs on average every ~ 19 days). The average of maximum rainfall among the 266 events is equal to ~ 52 mm/day (2.1 inch/day) with standard deviation of ~ 24 mm/day (0.96 inch/day).

3.
The 5-day antecedent rainfall at each grid cell was computed for each event. The average of maximum antecedent rainfall is ~ 5.5 mm/day (0.2 inch/day) with standard deviation of ~ 6 mm/day (0.24 inch/day). This data is used to compute the runoff as explained later in this section. 14% of extreme events occurred during boreal winter (Dec–Feb), 18% during spring (Mar–May), 41% during summer (Jun–Aug), and 27% during autumn (Sep–Nov).
The spatial dependence of grid cell rainfall is investigated in Fig. 3, which shows the percentage of concurrent extreme rainfall events occurring at the grid cells. Dark blue shading indicates that the corresponding grid cells experienced precipitation greater than R_{95} for many of the same extreme events. The concentrated areas of dark blue shading along the diagonal of Fig. 3 illustrate that grid cells corresponding to the same sewersheds are highly dependent (see Table 1 for grid cell index locations). There is also spatial dependence between sewersheds, as illustrated by the off-diagonal areas shaded with dark blue. For example, there is about a 75% chance that an extreme event was present at the Jamaica (JAM) grids (G16–G31) given that an extreme event was present at the Red Hook (RH) grids (G55–G61). The probability of ~ 75% is determined by averaging the grid cell values that fall in the intersection of the corresponding parallel lines in Fig. 3.
The rain gauge data for the same period (2002–2015) was processed as follows:
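The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names, the linear-interpolation percentile, and the treatment of antecedent rainfall as the total over the five preceding days are assumptions made for the example. Every grid cell is assumed to have at least one rainy day.

```python
import math

def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def extreme_events(rain, q=95.0, window=5):
    """rain[d][g]: daily rainfall (mm) on day d at grid cell g.

    Returns (thresholds, event_days, antecedent), where antecedent[k][g] is
    the rainfall total over the `window` days preceding event k at cell g.
    """
    n_days, n_cells = len(rain), len(rain[0])
    # Step 1: per-cell threshold computed from rainy (non-zero) days only.
    thresholds = [
        percentile([rain[d][g] for d in range(n_days) if rain[d][g] > 0], q)
        for g in range(n_cells)
    ]
    # Step 2: an event day is any day on which at least one cell exceeds its threshold.
    event_days = [
        d for d in range(n_days)
        if any(rain[d][g] > thresholds[g] for g in range(n_cells))
    ]
    # Step 3: antecedent rainfall summed over the preceding `window` days.
    antecedent = [
        [sum(rain[k][g] for k in range(max(0, d - window), d)) for g in range(n_cells)]
        for d in event_days
    ]
    return thresholds, event_days, antecedent
```

Applied to a days-by-76 array of Stage IV daily totals, `event_days` would play the role of the 266 extreme event days described above.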
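A concurrence matrix of this kind can be computed as sketched below. The sketch assumes that the plotted quantity is a conditional frequency (the share of one cell's exceedance days on which another cell also exceeds its R_{95}); the exact normalization behind Fig. 3 is not stated, and the function names are hypothetical.

```python
def concurrence(exceed):
    """exceed[k][g] is True when cell g exceeds its R95 on event day k.

    Returns c[a][b], the fraction of cell a's exceedance days on which
    cell b also exceeds (None when cell a never exceeds).
    """
    n_cells = len(exceed[0])
    counts = [sum(row[a] for row in exceed) for a in range(n_cells)]
    return [
        [
            sum(row[a] and row[b] for row in exceed) / counts[a] if counts[a] else None
            for b in range(n_cells)
        ]
        for a in range(n_cells)
    ]

def block_mean(c, rows, cols):
    """Average concurrence over a block of cells, e.g. the grids of two sewersheds."""
    vals = [c[a][b] for a in rows for b in cols if c[a][b] is not None]
    return sum(vals) / len(vals)
```

Averaging a block with `block_mean` mirrors how a sewershed-to-sewershed value such as the ~ 75% quoted above is obtained from individual grid cell values.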

1.
The 95th percentile, \(R_{95}^{\prime}\), of precipitation was identified at each rain gauge station as the extreme rainfall threshold. The average of the \(R_{95}^{\prime}\) threshold among the 4 stations is ~ 36 mm/day (1.5 inch/day).

2.
The extreme events were identified for each station by applying the corresponding \(R_{95}^{\prime}\) threshold. This resulted in 89, 85, 84, and 86 extreme event days at CP, LGA, JFK, and EWR stations, respectively. The average of maximum rainfall among the four stations was ~ 180 mm/day (7.1 inch/day). The different number of events at each station is an artifact of the decision to estimate \(R_{95}^{\prime}\) from rainy days only (the total number of which is not constant across the four stations).

3.
The 5-day antecedent rainfall was calculated. The average of antecedent rainfall among the four stations was ~ 40 mm/day (1.6 inch/day). 13% of events occurred during boreal winter, 23% during spring, 36% during summer, and 28% during autumn, similar to the seasonality of the radar-derived extreme rainfall days.
Spatial simulator algorithm
Multivariate simulations are often necessary for risk analysis (Rajagopalan et al. 1997; Vogl et al. 2012; Lall et al. 2016; Xu et al. 2017). In such a case, the dependencies between all variables (here, the individual rainfall grid cells), which define the spatial field (here, the extreme rainfall field), should be preserved by the simulation framework. This is because the use of a simple univariate approach could lead to considerable over- or underestimation of the risk associated with a given event (Raynal-Villasenor and Salas 1987; Bruneau et al. 1994). Furthermore, the use of standard multivariate distributions with Gaussian structure is not reasonable if the marginal distributions are non-normal (e.g., heavy tailed asymmetric distributions: Titterington et al. 1985; West 1992; Meylan et al. 2012). Copulas have been shown to be a useful way to model the dependence structure independent of the marginal distributions (Sklar 1959), which more easily allows one to model dependent, non-Gaussian data, as is the case here.
Let F(x) be the joint distribution of the random variables \(x = \left( {x_{1} ,x_{2} , \ldots ,x_{m} } \right)\) and let F(x_{i}) be the marginal distribution function of variable x_{i}, where i goes from 1 to m. A copula is a function that links the joint distribution F(x) to its univariate marginals F(x_{i}). Sklar (1959) proved that for every multivariate distribution F(x) there exists a copula \(C:\left[ {0,1} \right]^{m} \to \left[ {0,1} \right]\) such that:
$$F\left( {x_{1} ,x_{2} , \ldots ,x_{m} } \right) = C\left( {F\left( {x_{1} } \right),F\left( {x_{2} } \right), \ldots ,F\left( {x_{m} } \right)} \right)$$ (1)
where F(x_{i}) ~ U[0,1]. When the marginal distributions are continuous, the multivariate probability density \(f\left( x \right)\) can be expressed in terms of the marginal densities of its comprising variables \(f_{i} \left( {x_{i} } \right)\) and a unique copula density \(c\):
$$f\left( {x_{1} ,x_{2} , \ldots ,x_{m} } \right) = c\left( {u_{1} ,u_{2} , \ldots ,u_{m} } \right)\prod\limits_{i = 1}^{m} {f_{i} \left( {x_{i} } \right)}$$ (2)
where u_{i} = F(x_{i}) are uniformly distributed random variables. Further information about copulas can be found, e.g., in Frees and Valdez (1989), Nelsen (1999), Aas et al. (2009), and Vogl et al. (2012).
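As a simple worked case of the copula factorization (a standard result, added here for illustration), consider independent variables. The independence copula is the product of its arguments, its density is identically one, and the joint density reduces to the product of the marginal densities:

$$C\left( {u_{1} , \ldots ,u_{m} } \right) = \prod\limits_{i = 1}^{m} {u_{i} } ,\quad c\left( {u_{1} , \ldots ,u_{m} } \right) = 1,\quad f\left( x \right) = \prod\limits_{i = 1}^{m} {f_{i} \left( {x_{i} } \right)}$$

Any departure of \(c\) from one therefore encodes dependence between the variables, which for rainfall fields is the dependence between grid cells.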
In this study, the nonparametric multivariate simulation approach based on the copula concept was applied on the spatially dependent extreme rainfall fields while the events were assumed to be temporally independent. In order to preserve the spatial dependency of the data, we employed the sampling strategy outlined in Lall et al. (2016). The steps are as follows:

1.
Nonparametric logspline density estimation was conducted for each grid cell (i = 1:76) over the extreme rainfall events (j = 1:266) as well as each grid cell’s antecedent rainfall over all events to estimate the marginal distributions.

2.
From each fitted \(f_{i} \left( {x_{i} } \right)\), a random sample (\(x'_{ij}\)) was drawn. The sampling was done with replacement and repeated 100 times (the number of simulations), and each sampled vector was sorted and stored in a matrix \(\left( {x''} \right)\).

3.
An empirical (pseudo) copula was considered. In this case, a copula function was applied to the empirical distribution functions (Deheuvels 1979) of the historical data set \(x_{j}\), j = 1:266. The empirical copula C_{emp} {z_{j}, j = 1:266} was constructed, where \(z_{j}\) is a rank matrix.

4.
From rank matrix \(z_{j}\), 266 samples were drawn with replacement (bootstrap) and recorded as \(z_{j}^{\prime}\), j = 1:266. This step was also repeated 100 times (no. of simulations).

5.
Finally, having the sorted matrix \(\left( {x''} \right)\) as well as the matrix of ranks from the empirical copula \((z^{'}_{ij} )\) for each simulation, a simulated matrix was defined using the following equation:
$$w_{ij} = x''_{i} \left[ {z_{ij}^{\prime} } \right]$$ (3)
where \(w_{ij}\) is the jth event of the simulated matrix at grid cell i, and \(x''_{i} \left[ {z_{ij}^{\prime} } \right]\) selects the \(z_{ij}^{\prime}\)th element of \(x''_{i}\). Figure 4 shows a sample illustration of Lall et al.’s (2016) approach for j = 1:12 extreme events using i = 1:3 grid cells for only one simulation. Variable \(x\) represents the rainfall values (mm/day) for 12 events (rows) at 3 grid cells (columns), \(x'\) represents the data sampled from the logspline distribution, and \(x''\) is \(x'\) sorted in ascending order, according to steps 1–2. In steps 3–4, \(z\), which is the rank matrix of \(x\), and \(z^{\prime}\), which is the resampled matrix of \(z\), are developed. To develop the simulated matrix, \(w\), we used Eq. 3 introduced in step 5. For instance, the second row of \(w\) is constructed based on the 2nd, 1st, and 2nd largest values of \(x''\) (all shaded in orange in Fig. 4). Another example is given for the 9th row of \(w\), shaded in blue.
Employing this approach, the simulated fields of extreme rainfall data were obtained from \(w_{ij}\) and used to calculate the runoff and its uncertainty at the WWTPs. The general code and formulations corresponding to this approach are available from Lall et al. (2016).
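The five steps can be sketched for a single simulation as follows. This is a minimal sketch under stated assumptions, not the code of Lall et al. (2016): a smoothed bootstrap on the log scale stands in for the logspline density fit, rainfall values are assumed strictly positive, and all function names and the `bandwidth` parameter are hypothetical.

```python
import math
import random

def ranks(column):
    """1-based ranks of a column (ties broken by order of appearance)."""
    order = sorted(range(len(column)), key=lambda j: column[j])
    r = [0] * len(column)
    for rank, j in enumerate(order, start=1):
        r[j] = rank
    return r

def sample_marginal(column, n, rng, bandwidth=0.1):
    """Stand-in for the logspline fit: smoothed bootstrap on the log scale."""
    return [
        math.exp(math.log(rng.choice(column)) + rng.gauss(0.0, bandwidth))
        for _ in range(n)
    ]

def simulate_field(x, rng):
    """x[j][i]: rainfall (> 0) of event j at grid cell i. Returns one simulated w[j][i]."""
    n_events, n_cells = len(x), len(x[0])
    columns = [[x[j][i] for j in range(n_events)] for i in range(n_cells)]
    # Steps 1-2: sample each marginal and sort ascending (x'').
    x_sorted = [sorted(sample_marginal(col, n_events, rng)) for col in columns]
    # Step 3: rank matrix z of the historical events (the empirical copula).
    z_cols = [ranks(col) for col in columns]
    # Step 4: bootstrap whole events (rows of z), so cross-cell ranks stay together.
    picks = [rng.randrange(n_events) for _ in range(n_events)]
    # Step 5: w[j][i] = x''_i[z'_ij], converting 1-based ranks to 0-based indices.
    return [
        [x_sorted[i][z_cols[i][j] - 1] for i in range(n_cells)]
        for j in picks
    ]
```

Because whole rows of the rank matrix are resampled, each simulated event inherits the cross-cell rank pattern of one historical event, while the magnitudes come from the fitted marginals. Calling `simulate_field` 100 times with the same `random.Random` instance would correspond to the 100 simulations described above.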
Sewer system uncertainty analysis
Urban hydrologic models can be classified with respect to spatial and temporal resolution (Fletcher et al. 2013). In the spatial dimension, models can be either lumped or distributed. Lumped models use spatial averages of subcatchments to represent the behavior of the full system (Willems 2001b; Löwe et al. 2014), while distributed models capture all the subcatchment components using a node-link structure (Elliott and Trowsdale 2007). In the temporal dimension, models can be event-based or continuous. Event-based analyses are commonly used in the design of infrastructure and simulate the hydrologic response to specifically designed storms (e.g., Delleur 2003), while continuous analyses seek to model system behavior under continuous forcing that includes periods of wet and dry weather. In this study, we used a spatially lumped and temporally event-based analysis to limit the computational expense and satisfy the temporal independence assumption of the simulation method (Lall et al. 2016).
Runoff is determined primarily by the amount of rainfall, the infiltration characteristics of the land cover, and antecedent rainfall. As explained earlier, the NRCS approach was employed in this study to calculate the runoff as a function of precipitation, the underlying soil’s permeability, land use, and antecedent water content of the soil:
$$P_{e} = \frac{{\left( {P - I_{a} } \right)^{2} }}{{P - I_{a} + S}}$$ (4)
where \(P_{e}\) is the effective rainfall (mm), \(P\) is the depth of rainfall (mm), \(S\) is the potential maximum retention after runoff begins (mm), and \(I_{a}\) is the initial abstraction (mm). The initial abstraction includes retained surface water as well as evaporated and infiltrated water, and is generally correlated with land cover parameters. As in Eq. 4, runoff cannot begin until the initial abstraction has been met. I_{a} can be approximated by \(I_{a} = 0.2 \times S\) for urban watersheds as per the USDA (Cronshey 1986). S is related to the soil and land cover conditions of the sewershed through the CN:
$$S = \frac{25400}{CN} - 254$$ (5)
Because the curve number methodology is event-based, the effects of antecedent moisture conditions must be taken into consideration. The CNs suggested for the normal Antecedent Moisture Condition (AMC_{II}) by NRCS were mapped in Fig. 2b. Depending on the seasonality (dormant or growing season) and total 5-day antecedent rainfall, equivalent curve numbers are suggested by NRCS. In the current case study, we assumed dormant-season conditions throughout in NYC and calculated the equivalent curve number according to the antecedent rainfall determined for gauge stations and radar grids. CN in dry conditions (AMC_{I}, 5-day antecedent rainfall < 12.7 mm) and wet conditions (AMC_{III}, 5-day antecedent rainfall > 27.9 mm) can be computed by:
$$CN_{I} = \frac{{4.2 \times CN_{II} }}{{10 - 0.058 \times CN_{II} }}$$ (6)
$$CN_{III} = \frac{{23 \times CN_{II} }}{{10 + 0.13 \times CN_{II} }}$$ (7)
where I, II, and III represent dry, normal, and wet conditions, respectively. A 5-day antecedent rainfall greater than 12.7 mm (0.5 inch) and less than 27.9 mm (1.1 inch) was considered normal according to NRCS.
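The curve number calculation can be sketched as below. The sketch assumes the widely used AMC conversion formulas and the metric form of the retention equation (S in mm), together with the dormant-season thresholds and the I_{a} = 0.2 × S approximation quoted in the text; the function names are hypothetical, and CN is assumed to lie in (0, 100].

```python
def adjust_cn(cn_ii, antecedent_mm, dry_below=12.7, wet_above=27.9):
    """Convert the AMC-II curve number to AMC-I or AMC-III (dormant-season limits)."""
    if antecedent_mm < dry_below:
        return 4.2 * cn_ii / (10.0 - 0.058 * cn_ii)   # dry soil, AMC-I
    if antecedent_mm > wet_above:
        return 23.0 * cn_ii / (10.0 + 0.13 * cn_ii)   # wet soil, AMC-III
    return cn_ii                                       # normal, AMC-II

def effective_rainfall(p_mm, cn, ia_ratio=0.2):
    """NRCS effective rainfall P_e (mm) for rainfall depth P (mm) and curve number CN."""
    s = 25400.0 / cn - 254.0   # potential maximum retention S (mm)
    ia = ia_ratio * s          # initial abstraction I_a (mm)
    if p_mm <= ia:
        return 0.0             # no runoff until the initial abstraction is met
    return (p_mm - ia) ** 2 / (p_mm - ia + s)
```

For example, a 50 mm event on a subgrid with CN = 80 under normal moisture conditions yields roughly 13.8 mm of effective rainfall, while the same event on a fully impervious surface (CN = 100) yields the full 50 mm.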
The framework of sewer system uncertainty analysis is summarized in Fig. 5. After preparing the data, the extreme rainfall field and corresponding antecedent events derived from the radar data were imported into the simulator. The NRCS approach was then applied to the simulated extreme precipitation events. CN and S were estimated from Eqs. 5–7 and applied to Eq. 4 to determine the effective rainfall at each subgrid cell. We subtracted the catch basin storage volumes per event (Q_{CB}) to compute the excess runoff from each event in each sewershed (\(Q_{s}^{t}\)):
$$Q_{s}^{t} = \sum\limits_{n = 1}^{N} {\left( {P_{e,n}^{t} \times A_{n} - Q_{CB,n}^{t} } \right)}$$ (8)
where \(P_{e,n}^{t}\) is the effective rainfall at subgrid cell n for event t, \(A_{n}\) is the area of subgrid cell n, \(Q_{CB,n}^{t}\) is the catch basin storage volume at subgrid cell n during event t, N is the number of subgrid cells in each sewershed (N changes for each sewershed, see Table 1), and s is the sewershed index (1:14). The simulation results were verified before further use, as reported in Sect. 4.
The excess runoff was also calculated using the rain gauge data according to the independent events developed in Sect. 3.1. The goal is to investigate the significance of considering the spatial dependence of the extreme rainfall fields between the grids by comparing \(Q_{s}^{t}\) derived from the simulated extreme rainfall field with \(Q_{s}^{t}\) derived from the spatially independent events (from rain gauges). In calculating runoff corresponding to rain gauge extreme rainfall we applied the same criteria used by NYC DEP (see Fig. 1b). Thus \(P_{e,n}^{t}\) in Eq. 8 is identical for n = 1:N according to the calibrated rain gauge station. Finally, we estimated the probability of CSO occurrence during extreme rainfall events, and the runoff change with respect to the land cover distribution and density. This analysis targets the stormwater management plans of the City.
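The aggregation to sewershed level can be sketched as follows, assuming effective rainfall in mm, subgrid areas in m^2, and catch basin storage in m^3. The function name and the unit handling are illustrative and not taken from the original formulation.

```python
def excess_runoff_m3(pe_mm, area_m2, cb_storage_m3):
    """Excess runoff volume (m^3) for one event in one sewershed.

    pe_mm[n], area_m2[n], cb_storage_m3[n]: effective rainfall, area, and
    catch basin storage of subgrid cell n.
    """
    total = 0.0
    for pe, area, storage in zip(pe_mm, area_m2, cb_storage_m3):
        volume = pe / 1000.0 * area   # a depth in mm over an area in m^2 gives m^3
        total += volume - storage     # remove what the catch basins can hold
    return total
```

A practical implementation might additionally floor each subgrid contribution at zero so that unused catch basin storage in one cell does not offset runoff in another; that choice is not specified in the text and is left out here.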
Results and discussion
Rainfall simulation verification
The distributions of simulated radar rainfall data were compared with the observed radar events in order to verify the simulation model. First, the median, standard deviation, and 90th percentile of the 266 extreme rainfall events’ rainfall data at each of the 76 grids were compared with the corresponding simulated values. Figure 6 shows the simulated versus observation-based median, standard deviation, and 90th percentile at three sample grids corresponding to the CP, LGA, and JFK station locations (see Fig. 1 for the locations). Aside from a small negative bias of less than 9% in the standard deviation of the simulated marginal distributions (Fig. 6b), the simulated and observed marginal distributions are quite similar.
We also compared the cross-station dependence of extreme rainfall field grids between the simulations and observations by comparing their rank correlation (RC), mutual information (MI), and tail dependence coefficient (TDC) across the grids (Fig. 7). Spearman’s rank-order correlation measures the strength and direction of a monotonic relationship between each pair of grid cells’ extreme rainfall data and can take values from − 1 to + 1. A value of 0 indicates that there is no association between the two variables and values greater than 0 indicate a positive association. Figure 7a shows high correlation between the grids at the same WWTP, shaded along the diagonal. Figure 7b also shows that the correlation values of the simulated and observed data are quite similar (between 0.4 and 1) and that the bias is generally between − 1 and 1%.
Mutual Information (MI), introduced by Shannon (1948), measures how far the joint distribution of a pair of stations departs from the product of their marginal distributions, i.e., it captures nonlinear as well as linear dependence (Cover and Thomas 1991). The results are scaled according to the transformation proposed by Joe (1989), which ranges from 0, for complete independence, to 1, for full dependence (large reduction in uncertainty). There is high MI between the grids at the same WWTP (Fig. 7c) and the bias in MI is generally modest (between − 5 and 5%), with the exception of a few high-bias grids (up to 30%) (blue stripes in Fig. 7d). Lastly, the Tail Dependence Coefficient (TDC) measures the probability that rainfall at one grid exceeds its 90th percentile given that rainfall at another grid also exceeds its 90th percentile (Ferreira 2013). Figure 7e shows high TDC between the grids within the areas serviced by the same WWTP. Figure 7f implies that the TDC values of the simulated and observed data are quite similar and that the bias is generally between − 5 and 5%.
Overall, the rank correlation is well simulated by the model (errors are less than 4% for all pairs of grids). This indicates that the model captures the monotonic relationships between sites. The bias in the mutual information between sites is greater (upwards of 5% for many pairs of grids). In particular, the model has a tendency to simulate a weaker nonlinear relationship between pairs of grids when compared to the observations (panels c and d of Fig. 7). There is also relatively large bias in the simulation of tail dependence between grids in some cases, although this bias is not systematic across all grid pairs (bottom panels of Fig. 7), i.e., negative tail dependence bias appears to be approximately as likely as positive tail dependence bias.
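Two of the dependence measures used in this verification can be estimated empirically as sketched below. The estimators are assumptions chosen for illustration (average ranks for ties, a linear-interpolation quantile, and a simple conditional exceedance frequency for the tail dependence); they are not necessarily the estimators behind Fig. 7, and the function names are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    r = [0.0] * len(values)
    k = 0
    while k < len(order):
        m = k
        while m + 1 < len(order) and values[order[m + 1]] == values[order[k]]:
            m += 1
        for t in range(k, m + 1):
            r[order[t]] = (k + m) / 2.0 + 1.0
        k = m + 1
    return r

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def quantile(values, q):
    """Linear-interpolation quantile, q in [0, 1]."""
    s = sorted(values)
    pos = (len(s) - 1) * q
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def tail_dependence(x, y, q=0.9):
    """Empirical P(y above its q-quantile | x above its q-quantile)."""
    qx, qy = quantile(x, q), quantile(y, q)
    joint = sum(1 for u, v in zip(x, y) if u > qx and v > qy)
    cond = sum(1 for u in x if u > qx)
    return joint / cond if cond else float("nan")
```

Computing each measure for every pair of grid cells in both the observed and simulated fields, and differencing the two, gives bias maps of the kind discussed above.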
Simulated runoff comparison
Figure 8a, b compare the distribution and mean of the simulated runoff with the rain gauge runoff at each WWTP. The results indicate that the values of runoff corresponding to the rain gauge data are significantly higher than the radar simulated runoff at the 95% confidence level. Since we considered the extreme rainfall field (76 grids) in the simulation and uniform extreme rainfall in the rain gauge runoff calculation, this overestimation of rain gauge runoff was expected. Such overestimation of hazard because of not considering spatial dependence has been reported in other studies (McRobie et al. 2013; Simões et al. 2015). This confirms the significance of considering spatial variation and dependence of extreme rainfall, as hypothesized in this paper.
To check whether there is a systematic bias in the extreme radar rainfall data, we evaluated the radar rainfall values at the four grid cells corresponding to the rain gauge locations on days when extreme events occurred, as determined by the rain gauge records. We computed the relative bias in radar data at each rain gauge station during every extreme event. Figure 9 shows that there is only modest bias in the radar rainfall values during rain-gauge-based extreme rainfall events. The median bias among the events is 0, 5, 0, and 12% at CP, LGA, JFK, and EWR, respectively. This bias is acceptable given that rain gauge rainfall estimates are derived from time-integrated point-based measurements while radar rainfall estimates are derived from spatially integrated and temporally discrete sampled measurements. This error has been noted by other researchers (e.g., Medlin et al. 2007; Villarini et al. 2010; Park et al. 2016).
Sewer system uncertainty analysis results
WWTP uncertainty (end-of-pipe control)
We estimated the excess runoff at each WWTP and compared the runoff from the extreme events with 2 × DDWF to determine the probability of exceeding the capacity at each WWTP. We accounted for baseflow wastewater [called Dry Weather Flow or DWF (NYCDEP 2012a)] by assuming that the ratio of DWF to DDWF (DWF/DDWF) ranges between 35 and 75% during all events. Thus, if \(Q_{s}^{t}\) is the precipitation-generated flow and we assume that DWF/DDWF is 50%, the capacity remaining for storm flow is 2 × DDWF − 0.5 × DDWF, and the probability of exceeding the flow capacity at each WWTP is \(P(Q_{s}^{t} > (1.5 \times DDWF))\). The boxplots in Fig. 10 represent the probability of the simulated runoff exceeding the plant capacities under various values of DWF/DDWF. For instance, P = 0.02 means that 2% of extreme events exceed the design capacity of the WWTP. The average baseflow (DWF/DDWF) at the NYC WWTPs during 2011 is presented in Table 2 (from NYCDEP 2012a). Based on these data, the median probabilities of exceeding the capacity of each WWTP are mapped in Fig. 11. The results indicate a higher probability of exceeding the capacity (and thus a higher likelihood of CSO) at the JAM (Jamaica), OH (Owls Head), and BB (Bowery Bay) WWTPs. Recent DEP construction projects have included upgrades to the wastewater treatment facilities and the storm sewer system, expanding the network and constructing large CSO retention tanks to further mitigate this chronic source of pollution. Some of the most recent CSO control systems in the City have been implemented at the BB, JAM, TI, and CI plant outfalls (http://www.nyc.gov/dep). The results in Figs. 10 and 11 can be a useful guide for end-of-pipe stormwater storage and treatment systems in the City.
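The exceedance calculation described above can be sketched as an empirical frequency over the simulated events. This is a minimal illustration under the stated rule that plant capacity is 2 × DDWF and baseflow is a fixed fraction of DDWF; the function name, argument names, and the flows in the usage note are hypothetical.

```python
def exceedance_probability(storm_flows, ddwf, dwf_ratio):
    """Fraction of simulated extreme events whose storm flow exceeds the
    capacity left after baseflow.

    Capacity is taken as 2 x DDWF and baseflow as dwf_ratio x DDWF, so an
    event exceeds capacity when its flow is above (2 - dwf_ratio) x DDWF.
    """
    if not storm_flows:
        raise ValueError("at least one simulated event is required")
    if not 0.0 <= dwf_ratio <= 2.0:
        raise ValueError("dwf_ratio must lie between 0 and 2")
    threshold = (2.0 - dwf_ratio) * ddwf
    exceed_count = sum(1 for q in storm_flows if q > threshold)
    return exceed_count / len(storm_flows)


def exceedance_by_ratio(storm_flows, ddwf, ratios=(0.35, 0.50, 0.75)):
    """Exceedance probability for several assumed DWF/DDWF ratios."""
    return {r: exceedance_probability(storm_flows, ddwf, r) for r in ratios}
```

With DDWF = 100 and ten hypothetical simulated flows of 50, 120, 160, 140, 155, 90, 30, 200, 149, and 151 (same units), a DWF/DDWF ratio of 50% gives a threshold of 150 and an exceedance probability of 0.4, while ratios of 35% and 75% give 0.1 and 0.6, respectively.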
Excess runoff prediction (source control)
We also estimated the change in runoff with respect to changes in the stormwater capture infrastructure of the City. First, we determined \(Q_{c}\) as the contribution (%) of each subgrid to the total runoff of the corresponding sewershed. Then, a nonparametric joint distribution was estimated for the simulated runoff contribution and the corresponding curve numbers weighted by the area of each subgrid cell. The joint distribution of \(Q_{c}\) and CN × Area of the subgrids is presented in Fig. 12a. The x-axis is the curve number weighted by area (CN × Area), the y-axis is the runoff contribution \(Q_{c}\) (%), and the contours in the 2D plot show the probability density function of the joint distribution. This approach may be useful in planning urban infrastructure. In NYC, the recent agreement between the City and the New York State Department of Environmental Conservation aims to reduce CSOs through a hybrid Green Infrastructure (GI) and grey infrastructure approach to improve water quality in NYC's waterways (http://www.nyc.gov/dep). GI is a source control approach that manages stormwater by detaining or retaining excess runoff through capture and controlled release, infiltrating runoff into the ground, and increasing vegetative uptake and evapotranspiration. GI therefore reduces the need for end-of-pipe stormwater storage and treatment systems, while providing additional benefits such as counteracting urban heat island effects (Wang et al. 2013). From the results in Fig. 12a, we were able to estimate the most effective GI placement from a CSO mitigation perspective. Figure 12b presents the median simulated runoff contribution (\(Q_{c}\)) of each subgrid cell in the corresponding sewershed, such that the \(Q_{c}\) values in each sewershed sum to 100%. Figure 12 illustrates which subgrids are the most effective locations for new GI if the goal is a 1% reduction of runoff within a certain sewershed achieved by decreasing the CN of the corresponding subgrid by 1%.
The values of the initial runoff contribution and the initial land cover (represented by CN) were taken from Figs. 12b and 2b, respectively. By comparing the surface of Fig. 12a for different neighborhoods (subgrid cells), we can find the areas in NYC with higher probabilities of a change \(\Delta Q_{c}\) for a given change ∆CN × Area. Figure 12c indicates those neighborhoods (top 20 subgrid cells), mostly located in the JAM (Jamaica) sewershed, for the assumed criteria. In addition, if a planner intends to reduce \(Q_{c}\) by a certain amount in a specific neighborhood, the best plan can be chosen by optimizing between the new CN and the cost of installing GI.
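To make the relationship between CN and runoff contribution concrete, the sketch below applies the standard NRCS curve number runoff equation (rainfall in inches, initial abstraction equal to 0.2 S) to each subgrid and expresses area-weighted runoff as a percentage of the sewershed total. It is a deterministic illustration only: the function names and inputs are hypothetical, and it does not reproduce the nonparametric joint distribution of Fig. 12a.

```python
def scs_runoff_depth(rain_in, cn):
    """NRCS curve number runoff depth (inches) for rainfall depth rain_in (inches)."""
    if not 0.0 < cn <= 100.0:
        raise ValueError("curve number must be in (0, 100]")
    s = 1000.0 / cn - 10.0        # potential maximum retention (inches)
    ia = 0.2 * s                  # initial abstraction (inches)
    if rain_in <= ia:
        return 0.0                # all rainfall is abstracted, no runoff
    return (rain_in - ia) ** 2 / (rain_in - ia + s)


def runoff_contributions(rain_in, curve_numbers, areas):
    """Percent contribution of each subgrid to total sewershed runoff volume."""
    volumes = [scs_runoff_depth(p, cn) * a
               for p, cn, a in zip(rain_in, curve_numbers, areas)]
    total = sum(volumes)
    if total == 0.0:
        return [0.0] * len(volumes)
    return [100.0 * v / total for v in volumes]


def contribution_after_cn_change(rain_in, curve_numbers, areas, index, cn_factor=0.99):
    """Recompute contributions after scaling one subgrid's CN (default: 1% decrease)."""
    new_cns = list(curve_numbers)
    new_cns[index] = new_cns[index] * cn_factor
    return runoff_contributions(rain_in, new_cns, areas)
```

Comparing `runoff_contributions` before and after `contribution_after_cn_change` for each candidate subgrid gives a simple ranking of where a given CN reduction lowers runoff the most under these assumptions.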
Conclusions
A novel framework that utilizes the simulation approach of Lall et al. (2016) was developed in this study to estimate the urban runoff during extreme precipitation events in NYC and compare this runoff to wastewater treatment plant flowrate capacities. The extreme precipitation simulation framework allowed us to simulate the uncertainty of extreme rainfall without neglecting the spatial structure of extreme rainfall data. The main conclusions of this study for NYC are summarized as follows:

1. According to the current analysis, the JAM (Jamaica), OH (Owls Head), and BB (Bowery Bay) WWTPs are more prone to CSO under extreme rainfall events (see Figs. 10 and 11).

2. In NYC, we were able to determine the neighborhoods where installing GI has the greatest effect in controlling excess runoff (see Fig. 12). The results were presented for the top 20 subgrid cells for an example scenario of reducing runoff within a certain sewershed by 1% by decreasing the curve number (CN) of the corresponding subgrid by 1%. However, these locations may change under different infrastructure scenarios.
In summary, the main contributions of this study are listed as follows:

1. The results of Sect. 4.2 confirmed the significance of preserving the spatial dependence of the extreme rainfall field between the grid cells in hydrologic modeling. Specifically, assuming uniform extreme rainfall (based on a rain gauge within a sewershed) can lead to overestimation of runoff.

2. The uncertainty analysis of WWTPs in Sect. 4.3 provided a guideline approach for end-of-pipe stormwater management. This paper presented a straightforward strategy for city planners to investigate the effect of infrastructure change on stormwater runoff as a source control system.
References
Aas K, Czado C, Frigessi A, Bakken H (2009) Pair-copula constructions of multiple dependence. Insur Math Econ 44(2):182–198
Apipattanavis S, Podestá G, Rajagopalan B, Katz RW (2007) A semiparametric multivariate and multisite weather generator. Water Resour Res 43(11):W11401
Berne A, Delrieu G, Creutin JD, Obled C (2004) Temporal and spatial resolution of rainfall measurements required for urban hydrology. J Hydrol 299:166–179
Bruneau P, Ashkar F, Bobée B (1994) Simplnorm: Un modèle simple pour obtenir les probabilités conjointes de deux débits et le niveau qui en dépend. Can J Civ Eng 5:883–895
Chen L, Singh VP, Shenglian G, Hao Z, Li T (2011) Flood coincidence risk analysis using multivariate copula functions. J Hydrol Eng 17(6):742–755
Cherrier J, Klein Y, Link H, Pillich J, Yonzan N (2016) Hybrid green infrastructure for reducing demands on urban water and energy systems: a New York City hypothetical case study. J Environ Stud Sci 6(1):77–89
Coles S, Bawa J, Trenner L, Dorazio P (2001) An introduction to statistical modeling of extreme values. Springer, London
Cover TM, Thomas JA (1991) Entropy, relative entropy and mutual information. Elem Inf Theory 2:1–55
Cronshey R (1986) Urban hydrology for small watersheds. US Dept. of Agriculture, Soil Conservation Service, Engineering Division, Washington
De Sousa MR, Montalto FA, Spatari S (2012) Using life cycle assessment to evaluate green and grey combined sewer overflow control strategies. J Ind Ecol 16(6):901–913
Deheuvels P (1979) La fonction de dépendance empirique et ses propriétés. Académie Royale de Belgique. Bull Classe Sci 65(5):274–292
Delleur JW (2003) The evolution of urban hydrology: past, present, and future. J Hydraul Eng 129(8):563–573
Elliott AH, Trowsdale SA (2007) A review of models for low impact urban stormwater drainage. Environ Model Softw 22(3):394–405
Environmental Protection Bureau of the New York State Office of the Attorney General (2014) Current & future trends in extreme rainfall across New York State
Ferreira MS (2013) Nonparametric estimation of the tail-dependence coefficient. RevStat 11(1):1–16
Fletcher TD, Andrieu H, Hamel P (2013) Understanding, management and modelling of urban hydrology and its consequences for receiving waters: a state of the art. Adv Water Resour 51:261–279
Frees EW, Valdez EA (1989) Understanding relationships using copulas. N Am Actuar J 2:1–25
Gires A et al (2015) Impacts of small scale rainfall variability in urban areas: a case study with 1D and 1D/2D hydrological models in a multifractal framework. Urban Water J 12(8):607–617
Gourley JJ, Flamig ZL, Hong Y, Howard KW (2014) Evaluation of past, present and future tools for radar-based flash-flood prediction in the USA. Hydrol Sci J 59(7):1377–1389
Hamidi A, Devineni N, Booth JF, Hosten A, Ferraro RR, Khanbilvardi R (2017) Classifying urban rainfall extremes using weather radar data: an application to greater New York area. J Hydrometeorol 18(3):611–623
Harrold TI, Sharma A, Sheather SJ (2003) A nonparametric model for stochastic generation of daily rainfall amounts. Water Resour Res. https://doi.org/10.1029/2003WR002570
Heiss WH, McGrew DL, Sirmans D (1990) NEXRAD: Next Generation Weather Radar (WSR-88D). Microw J 33:79–98
Horton RE (1933) The role of infiltration in the hydrologic cycle. EOS Trans Am Geophys Union 14(1):446–460
Horton R, Rosenzweig C, Gornitz V, Bader D, O’Grady M (2010) Climate risk information. Ann N Y Acad Sci 1196(1):147–228
Joe H (1989) Relative entropy measures of multivariate dependence. J Am Stat Assoc 84(405):157–164
Kitzmiller D, Miller D, Fulton R, Ding F (2013) Radar and multisensor precipitation estimation techniques in National Weather Service hydrologic operations. J Hydrol Eng 18:133–142
Lall U, Devineni N, Kaheil Y (2016) An empirical, nonparametric simulator for multivariate random variables with differing marginal densities and nonlinear dependence with hydroclimatic applications. Risk Anal 36(1):57–73
Lang TJ, Nesbitt SW, Carey LD (2009) On the correction of partial beam blockage in polarimetric radar data. J Atmos Ocean Technol 26:943–957
Lin Y, Mitchell KE (2005) The NCEP stage II/IV hourly precipitation analyses: development and applications. In: 19th conference on hydrology, Paper 1.2. American Meteorological Society, San Diego
Löwe R, Mikkelsen PS, Madsen H (2014) Stochastic rainfall-runoff forecasting: parameter estimation, multistep prediction, and evaluation of overflow risk. Stoch Env Res Risk Assess 28(3):505–516
McRobie FH, Wang LP, Onof C, Kenney S (2013) A spatial-temporal rainfall generator for urban drainage design. Water Sci Technol 68(1):240–249
Medlin JM, Kimball SK, Blackwell KG (2007) Radar and rain gauge analysis of the extreme rainfall during Hurricane Danny’s (1997) landfall. Mon Weather Rev 135(5):1869–1888
Mehrotra R, Li J, Westra S, Sharma A (2015) A programming tool to generate multisite daily rainfall using a two-stage semi-parametric model. Environ Model Softw 63:230–239
Meylan P, Favre AC, Musy A (2012) Predictive hydrology: a frequency analysis approach. CRC Press, Boca Raton
Miller DA, Kitzmiller D, Wu S, Setzenfand R (2010) Radar precipitation estimates in mountainous regions: corrections for partial beam blockage and general radar coverage limitations. In: 24th international conference on hydrology
Morin E, Goodrich DC, Maddox RA, Gao X, Gupta HV, Sorooshian S (2006) Spatial patterns in thunderstorm rainfall events and their coupling with watershed hydrological response. Adv Water Resour 29(6):843–860
Nelsen RB (1999) An Introduction to Copulas. Springer, New York
New York City Department of Environmental Protection, Bureau of Water and Sewer Operations (2009) Sewer design standards, 5 Jan 2009
New York City Department of Environmental Protection, Bureau of Wastewater Treatment (2012a) INFOWORKS citywide recalibration report, June 2012
New York City Department of Environmental Protection (2012b) Guidelines for the design and construction of stormwater management systems. New York City Department of Environmental Protection/New York City Department of Buildings, New York
New York City Department of Environmental Protection (2016) NYC Municipal Separate Storm Sewer System (MS4) 2016 progress report
Notaro V, Fontanazza CM, Freni G, Puleo V (2013) Impact of rainfall data resolution in time and space on the urban flooding evaluation. Water Sci Technol 68(9):1984–1993
Nuswantoro R, Diermanse F, Molkenthin F (2014) Probabilistic flood hazard maps for Jakarta derived from a stochastic rainstorm generator. J Flood Risk Manag 9:105–124
Park T, Lee T, Ahn S, Lee D (2016) Error influence of radar rainfall estimate on rainfall-runoff simulation. Stoch Env Res Risk Assess 30(1):283–292
Pina RD, Ochoa-Rodriguez S, Simões NE, Mijic A, Marques AS, Maksimović Č (2016) Semi- vs. fully-distributed urban stormwater models: model set up and comparison with two real case studies. Water 8(2):58
Rajagopalan B, Lall U, Tarboton DG, Bowles DS (1997) Multivariate nonparametric resampling scheme for generation of daily weather variables. Stoch Hydrol Hydraul 11(1):65–93
Ramos MH, Creutin JD, Leblois E (2005) Visualization of storm severity. J Hydrol 315:295–307
Raynal-Villasenor JA, Salas JD (1987) Multivariate extreme value distributions in hydrological analyses. In: Water for the future: hydrology in perspective, pp 111–119
Rosenzweig C, Major DC, Demong K, Stanton C, Horton R, Stults M (2007) Managing climate change risks in New York City’s water system: assessment and adaptation planning. Mitig Adapt Strat Glob Change 12(8):1391–1409
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27(3):379–423
Simões NE, Ochoa-Rodríguez S, Wang LP, Pina RD, Marques AS, Onof C, Leitão JP (2015) Stochastic urban pluvial flood hazard maps based upon a spatial–temporal rainfall generator. Water 7(7):3396–3406
Singh VP (1997) Effect of spatial and temporal variability in rainfall and watershed characteristics on stream flow hydrograph. Hydrol Process 11(12):1649–1669
Sklar A (1959) Fonctions de répartition à n dimensions et leurs marges. Publ Inst Stat Univ Paris 8:229–231
Smith JA, Baeck ML, Morrison JE, Sturdevant-Rees P, Turner-Gillespie DF, Bates PD (2002) The regional hydrology of extreme floods in an urbanizing drainage basin. J Hydrometeorol 3(3):267–282
Smith JA, Baeck ML, Meierdiercks KL, Nelson PA, Miller AJ, Holland EJ (2005) Field studies of the storm event hydrologic response in an urbanizing watershed. Water Resour Res 41:10413
Soil Conservation Service (1985) National engineering handbook. Section 4—Hydrology. Soil Conservation Service, Washington
Spierre SG, Wake C (2010) Trends in extreme precipitation events for the Northeastern United States 1948–2007. University of New Hampshire, Durham
Titterington DM, Smith AFM, Makov UE (1985) Statistical analysis of finite mixture distributions. Wiley, Hoboken
Villarini G, Smith JA, Baeck ML, Sturdevant-Rees P, Krajewski WF (2010) Radar analyses of extreme rainfall and flooding in urban drainage basins. J Hydrol 381(3):266–286
Vivekanandan J, Yates D, Brandes E (1999) The influence of terrain on rainfall estimates from radar reflectivity and specific propagation phase observations. J Atmos Oceanic Technol 16:837–845
Vogl S, Laux P, Qiu W, Mao G, Kunstmann H (2012) Copula-based assimilation of radar and gauge information to derive bias-corrected precipitation fields. Hydrol Earth Syst Sci 16(7):2311–2328
Wang R, Eckelman MJ, Zimmerman JB (2013) Consequential environmental and economic life cycle assessment of green and gray stormwater infrastructures for combined sewer systems. Environ Sci Technol 47(19):11189–11198
West M (1992) Modelling with mixtures. Bayesian statistics 4. Oxford University Press, New York, pp 503–525
Wheater HS, Chandler RE, Onof CJ, Isham VS, Bellone E, Yang C, Lekkas D, Lourmas G, Segond ML (2005) Spatial-temporal rainfall modelling for flood risk estimation. Stoch Env Res Risk Assess 19(6):403–416
Willems P (2001a) A spatial rainfall generator for small spatial scales. J Hydrol 252:126–144
Willems P (2001b) Stochastic description of the rainfall input errors in lumped hydrological models. Stoch Env Res Risk Assess 15(2):132–152
Wright DB, Smith JA, Villarini G, Baeck ML (2013) Estimating the frequency of extreme rainfall using weather radar and stochastic storm transposition. J Hydrol 488:150–165
Xu Y, Huang G, Fan Y (2017) Multivariate flood risk analysis for Wei River. Stoch Env Res Risk Assess 31(1):225–242
Yang L, Smith JA, Wright DB, Baeck ML, Villarini G, Tian F, Hu H (2013) Urbanization and climate change: an examination of nonstationarities in urban flooding. J Hydrometeorol 14:1791–1809
Zhou Q, Mikkelsen PM, Halsnæs K, Arnbjerg-Nielsen K (2012) Framework for economic pluvial flood risk assessment considering climate change effects and adaptation benefits. J Hydrol 414:539–549
Acknowledgements
This research was supported by NOAA CREST Cooperative Agreement NA11SEC4810004. The statements contained within the manuscript are not the opinions of the funding agency or the U.S. Government but reflect the authors’ opinions. We thank Naresh Devineni from City College of New York for providing valuable feedback during the inception of this work.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Hamidi, A., Farnham, D.J. & Khanbilvardi, R. Uncertainty analysis of urban sewer system using spatial simulation of radar rainfall fields: New York City case study. Stoch Environ Res Risk Assess 32, 2293–2308 (2018). https://doi.org/10.1007/s00477-018-1563-8
Keywords
 Spatial simulator
 Extreme rainfall
 Radar rainfall data
 NYC sewer system