Abstract
Despite the close linkage between extreme floods and snowmelt, particularly through rain-on-snow (ROS), hydrologic infrastructure is mostly designed based on standard precipitation Intensity-Duration-Frequency curves (PREC-IDF) that neglect snow processes in runoff generation. For snow-dominated regions, such simplification could result in substantial errors in estimating extreme events and infrastructure design risk. To address this long-standing problem, we applied the Next Generation IDF (NG-IDF) technique to estimate design basis extreme events for different durations and return periods in the conterminous United States (CONUS) to distinctly represent the contribution of rain, snowmelt, and ROS events to the amount of water reaching the land surface. A suite of datasets were developed to characterize the magnitude, trend, seasonality, and dominant mechanism of extreme events for over 200,000 locations. Infrastructure design risk associated with the use of PREC-IDF was estimated. Accuracy of the model simulations used in the analyses was confirmed by long-term snow data at over 200 Snowpack Telemetry stations. The presented spatially continuous datasets are readily usable and instrumental for supporting site-specific infrastructure design.
Measurement(s) | gridded precipitation |
Technology Type(s) | weather station |
Sample Characteristic - Environment | flood • snowmelt • hydrological process |
Sample Characteristic - Location | contiguous United States of America |
Similar content being viewed by others
Background & Summary
Although it is well understood in the scientific community that extreme hydrometeorological events in cold climates are often related to snow processes, i.e., snowmelt and rain-on-snow (ROS)1,2,3,4,5,6, they have largely been ignored or under-represented in hydrologic design that relies largely on traditional precipitation-based intensity-duration-frequency curves (PREC-IDF) to estimate design basis extreme events (e.g., 100-year 24-hour event). PREC-IDF such as the National Oceanic and Atmospheric Administration (NOAA) Atlas 147 assumes that precipitation is in the form of rainfall that is immediately available for rainfall-runoff processes. This assumption has obvious shortcomings, especially in snow-dominated regions where winter precipitation is primarily snowfall. At locations where runoff is released slowly from accumulated snowpack, PREC-IDF can lead to infrastructure overdesign and incur unnecessary costs. Conversely, underdesign will occur where the snow-driven runoff rate is higher than the rate of precipitation. This was confirmed by previous research8,9, which demonstrated that PREC-IDF underestimates the 100-year, 24-hour extreme events in 45% of the Snow Telemetry (SNOTEL) stations examined, and the resulting peak design flood could be underestimated by up to 324%.
The Oroville Dam failure in February 2017 that required $1.5 billion to repair is a notable example of costly infrastructure damages resulting from floods driven by ROS events10. The exclusion of snow processes in the PREC-IDF technique is likely to cause greater errors in estimating extremes in a warming climate and present higher infrastructure design risk. For example, increased frequency and intensity of atmospheric rivers in the future are anticipated to cause more extreme orographic precipitation and subsequently increase flood risk along the U.S. West Coast11,12,13,14.
Given the limitations in PREC-IDF and implications for design risk, the Next-Generation IDF (NG-IDF) technique8 was developed to estimate extreme events based on the amount of water reaching the land surface (W) during rain, snowmelt, and ROS events. By including snow processes, NG-IDF provides a systematic and consistent technique for all environments from rain-dominated, transitional, to snow-dominated locations. Despite the marked advantages of NG-IDF over PREC-IDF, its wide adoption by engineers and planners is hindered by rather limited snow observations for estimating W, especially relative to widely available precipitation products15,16,17. Using physics-based models to produce reasonable simulations of W, while feasible, is rather challenging, given the significant requirements of expert knowledge in model calibration and considerable cost of computational resources.
To support the broad adoption of the NG-IDF approach, we developed a suite of datasets that characterize extreme events relevant to hydrologic design over 1951‒2013 at a 1/16th-degree (~6 km) resolution across the conterminous United States (CONUS). For over 200,000 locations in the CONUS, the datasets provide: the magnitude and dominant driving mechanism (i.e., rain, melt, and ROS) of extreme events for different durations (24‒72 hours) and return periods (2‒500 years) derived from the NG-IDF curves; the magnitude of design extreme events associated with different hydrometeorological drivers; trend and seasonality of annual maximum W events (AMW) over 1951‒2013; and infrastructure design risk associated with the PREC-IDF technique. These datasets were developed based on sub-daily simulations of W by a well-validated physics-based hydrological model (Distributed Hydrology Soil Vegetation Model, DHSVM18). To examine the accuracy of the simulations, the simulated snow water equivalent (SWE) was evaluated against long-term observations of SWE from 246 SNOTEL stations distributed across the Western U.S. These datasets intend to offer spatially distributed, quantitative measures of extreme events that are readily usable by the science and engineering communities to understand the climatology and driving mechanism(s) of extreme events, identify potential infrastructure design risk associated with the standard PREC-IDF method, and improve estimation of extremes.
Methods
Overview of NG-IDF datasets development
The method to develop the NG-IDF datasets is illustrated in Fig. 1. In contrast to PREC-IDF that estimates extreme events based on the total amount of precipitation (implicitly assumed to be in the form of rain), NG-IDF curves are developed based on the amount of water available for runoff (W) for the bare ground condition with no canopy cover, which can be represented mathematically by:
where P is precipitation, ΔSWE is the change in ground snowpack water content, and S indicates condensation (positive) or evaporation/sublimation (negative) of snowpack. In this development, as described in more detail in the following sections, P was taken from the gridded, gauge-based meteorological dataset, ΔSWE and S were simulated by the DHSVM snow model. The resulting W data is a 3-hourly time series for 1950–2013 at 207,173 grid cells covering the land surface of the CONUS at a 1/16th-degree resolution.
As observations suggested that snowfall events largely last up to 72 hours19, NG-IDF curves were developed for W events with 24-, 48- and 72-hour durations, respectively. We first developed series of annual maximum W (AMW) for years 1951–2013 based on subdaily estimations of W aggregated over different durations. For example, AMW with a 72-hour duration is the maximum of the 72-hour moving sum of W over a given year. Given variations in precipitation seasonality across the CONUS, we defined AMW events for both water year (October 1st through September 30th) and calendar year (January 1st through December 31st). AMW based on calendar year is more appropriate for locations with extreme precipitation occurring during the summer season such as the Southwest U.S., and AMW based on water year is better for locations with extreme winter precipitation such as the Western U.S. Given the focus on snow-driven extreme events, we used AMW defined by water years for IDF curves.
We applied the nonparametric Mann-Kendall test20,21 on the AMW series. Where trends were significant at the 5% confidence level, we detrended the time series using Sen’s slope22 while maintaining the long-term average of the time series. As suggested by NOAA Atlas 14, the generalized extreme value (GEV) distribution was fitted to AMW across all locations using the L-moments statistics7,8. Here we used the same GEV distribution for analysis of extreme events with different durations across the CONUS so that direct comparisons can be made in frequency estimates across durations or between locations. To quantify the implications of using PREC-IDF for infrastructure design risk, we also developed PREC-IDF curves following the same approach based on annual maximum precipitation (P). The design risk was quantified by comparing the NG-IDF and PREC-IDF values of extreme events (e.g., 100-year 24-hour event) over the CONUS.
A Monte Carlo (MC) simulation procedure23 and NOAA Atlas 147 was used to consider sample data uncertainty in frequency analysis. For each location, we generated 1,000 MC synthetic annual maximum series. We then fitted the GEV distribution to each MC series using the L-moments statistics and estimated the associated NG-IDF values. The uncertainties in NG-IDF curves for each location were quantified using the 5% and 95% quantiles (i.e., the 90% confidence interval) of ensemble members.
Gauge-based gridded meteorological data
For simulations of snow processes, the model requires subdaily input of P, air temperature, relative humidity, downward shortwave and longwave radiation, and wind speed. The meteorological input used here was derived from disaggregating the gridded (~6 km), daily land surface meteorological dataset24 over the CONUS using the Mountain Microclimate Simulation Model (MTCLIM) algorithm25. Distinct from reanalysis products, this dataset is one of the few gauge-based gridded datasets with long-term continuous records. It includes daily records of P, maximum and minimum air temperature, and wind speed spanning the period 1950–2013. The disaggregation algorithms25 estimate subdaily air temperatures with a 3rd-order polynomial fit based on daily temperatures range. Radiation and relative humidity are estimated based on daily temperature range, P, and solar geometry25. P is assumed to occur at a uniform rate throughout the day.
Observational snow data
Daily SWE measurements were retrieved from SNOTEL stations for snow model parameterization and evaluation. Among the 785 SNOTEL stations, we selected 246 stations that shared the longest common period (2007‒2013) of bias-corrected and quality-controlled (BCQC) daily SWE records. Briefly, standard quality control procedures8,26 were applied to remove stations with missing data, outliers, and problematic SWE values (e.g., peak SWE > accumulated winter precipitation). The BCQC procedures and the resulting BCQC datasets27 are available to the public at https://www.pnnl.gov/data-products.
CONUS-scale snow modeling and parameter development
The physics-based, snow submodel of DHSVM18 was implemented at the point scale to simulate snowpack dynamics under a bare ground and flat terrain condition. The model was run at the 3-hourly time step from 1950–2013 for 207,173 point locations at a 1/16th-degree grid spacing that coincides with the center of the meteorological grids. Model output includes 3-hourly times series of SWE and S, which are used together with observed P for calculating W in Eq. 1.
DHSVM simulates ground snowpack accumulation and melt using a two-layer mass and energy balance ground snowpack module. The mass balance components consist of P, S, changes in SWE, and melt from the snowpack. The partition of P into rain and snow is based on air temperature thresholds:
where Ta is air temperature, TS and TR is the temperature threshold for P to be completely snowfall and rain, respectively. If Ta ≥ TS, 100% of the P is rain (R); if Ta < TR, 100% of the P is snow; if Ta falls between the two thresholds, rain and snow are proportionally allocated to represent mixed rain and snow events. Energy balance at the snow surface is driven by net radiation, sensible and latent heat, and advected heat by rain. Energy and mass exchange between the thin surface layer and deep snowpack layer occurs via the exchange of meltwater. When liquid water in the deep snowpack exceeds its holding capacity, excess water is released to the underlying soil column. Detailed descriptions of the DHSVM snow model physics and governing algorithms can be found in a large body of literature18,28,29,30.
Prior calibration of snow models is performed typically at relatively local scales; thus, there are no calibrated, spatial snow parameter sets that can be readily applied for the CONUS-domain snow modeling. In support of this work as well as future large-domain snow modeling, here we developed spatially distributed snow parameters for the CONUS domain. Based on previous research29 that documented the robustness of regionally coherent snow parameters in modeling snowpack dynamics, we developed snow parameters for five spatial clusters covering the CONUS (Fig. 2). Given strong correlations between winter climate and key aspects of snowpack dynamics31,32,33, we determined the clusters using the k-means clustering machine learning technique34 based on the grid-level climatological mean of P, maximum and minimum air temperature, and wind speed from November through March during 1950–2013. Different numbers of clusters were tested and we selected the optimal five clusters based on the inertia elbow method and our previous work29.
Parameter development focused on four snow parameters that were identified by previous work29 to be crucial for capturing daily SWE dynamics29: (1) TS (defined in Eq. 2), (2) fresh snow albedo (amax), (3) albedo decay coefficient during snow accumulation (λA) and (4) snowmelt (λM). The last three parameters are applied in the snow albedo decay curve of Laramie & Schaake35 for estimating snow surface albedo evolution, given by:
where αA and αM are the snow surface albedo during the accumulation and melting seasons, respectively, and d is the number of days since the last snowfall. For other model parameters, we used the default values. The cluster-based snow parameters (Table 1) were developed as follows: (1) we produced prior ensemble parameters, consisting of 10,000 sets of the four snow parameters drawn uniformly from their physically plausible ranges, using the Latin Hypercube Sampling algorithm; (2) snow simulations were conducted at each SNOTEL location for every prior parameter set; (3) the posterior ensemble parameters were resampled from the prior ensemble if they met the threshold values of objective functions with observations. Here, we used three metrics (Eqs. 4‒6): Nash-Sutcliffe Efficiency (NSE) of daily SWE ≥ 0.6, the bias in the mean annual peak SWE (PEAK.ERR) within ± 25%, and the bias in the timing of peak SWE (PDATE.ERR) ≤ 14 days.
where Oi and Yi are the observed and predicted daily SWE at day i, respectively; t is the total number of days for which model simulations were performed; \(\bar{O}\) is the observed mean daily SWE over the simulation period.
where \({O}_{k}^{P}\) and \({Y}_{k}^{P}\) are the observed and predicted peak SWE for the kth water year, respectively; ny is the total number of years of simulation.
where \({Y}_{k}^{D}\) and \({O}_{k}^{D}\) are Julian dates of observed and predicted peak SWE for the kth water year, respectively. After step (3), 58 stations with no qualified posterior parameter values were removed from subsequent analyses. Most of these stations are located in high-latitude areas of Montana and Wyoming, where model skill is challenged by the lack of representation of wind effects on snow redistribution, or the maritime Pacific Northwest where snowmelt tends to be sensitive to errors in modeled energy balance and precipitation partitioning when temperature is near freezing29; (4) for each cluster, final parameter values were calculated as the ensemble mean over the posterior parameter space of all stations within the cluster. For the Southern cluster (C3), no SNOTEL observations are available for parameter development. Because snowfall is very limited for most of the Southern cluster where snow parameters have negligible effects on extreme events, we used the parameter values of the maritime cluster for the Southern cluster given their commonality of warm winter. The cluster parameter values are presented in Table 1.
Driving mechanism of extreme events
For each location, the driving mechanism was determined for extreme W events with different durations and return periods. Table 2 presents the classification approach used for determining the mechanism based on subdaily information of P and SWE, which include:
-
1.
Rainfall only (R): precipitation on snow-free ground;
-
2.
Snowmelt only (M): decreasing SWE with no concurrent precipitation;
-
3.
Rain-on-snow (ROS): decreasing SWE with concurrent precipitation. Given the interest in flood potential, a ROS event is further refined as one with at least 10 mm rainfall per day falling on a snowpack with at least 10 mm SWE over the selected duration, and the sum of rain and snowmelt contains at least 20% of snowmelt36,37,38.
Annual maximum series resulting distinctively from each driving mechanism were determined, based on which IDF curves were developed following the same approach as described for NG-IDF. The mechanism producing the largest IDF value was identified as the dominant mechanism of extreme events.
Seasonality of extreme events
The seasonality of AMW at the 24-, 48- and 72-hour durations was represented by the seasonality index (SI) and the mean date (MD) relative to October 1st due to the use of water year. They were calculated using the circular statistics1,39,40, given by:
where
For a given water year denoted by i, D is the day of the AMF occurrence relative to October 1st (i.e., D = 1 if the event occurred on October 1st); n is the total number of water years used in the analysis. SI, ranging from 0 to 1, measures the temporal variability of the occurrence of events. A smaller SI suggests weaker seasonality, and the associated MD is therefore less reflective of the actual timing of the extreme events.
Data Records
The NG-IDF datasets41 are available to the public through an unrestricted repository at https://doi.org/10.5281/zenodo.5827028 in comma-separated value (.csv) format. Table 3 provides a summary of the folder structures, description of data files, output variables in each file and the format.
Technical Validation
The accuracy of estimated W and all related datasets depends primarily on the accuracy of daily SWE simulations given that P was observational (see Eq. 1). As there exists no data for direct evaluation of NG-IDF curves, we validated SWE simulations against daily SWE observations from 246 SNOTEL stations. Three model performance metrics that compare the simulated and observed SWE were applied: (1) NSE, (2) PEAK.ERR, and (3) PDATE.ERR, which measured the overall goodness-of-fit, peak SWE, and the timing of peak SWE, respectively. Model evaluations (Fig. 2) showed that NSE of daily SWE was greater than 0.6 at 75% of all stations, PEAK.ERR was within ± 25% at 67% of the stations, and PDATE.ERR was within two weeks at 67% of the stations. Overall, the simulations were able to reproduce the observed SWE dynamics at most stations using the cluster-based snow parameters.
Usage Notes
The NG-IDF datasets listed in Table 3, with no additional data analysis, can be used for a wide variety of applications over spatial scales ranging from local, regional to the CONUS scales. Overall, the presented estimates of extreme events and their characteristics based on long-term observational and simulation records are crucial for understanding flood potential, particularly for cold regions where infrastructure design risk exists from using PREC-IDF curves or NOAA Atlas 14.
For hydrologic engineering designs and analyses, one can obtain for any location(s) of interest the magnitude of extreme W events (Fig. 3a) and their dominant mechanism (Fig. 3c). Through comparing the NG-IDF and PREC-IDF values of extreme events for the same locations, one can determine the magnitude of bias or design risk related to the use of PREC-IDF (Fig. 3b). Information related to the flood seasonality is also key for understanding the generating mechanisms of floods and supporting future flood management. For instance, one can obtain the seasonality index (SI) and the mean timing of AMW events for any location of interest. For locations with a stronger seasonality (i.e., a higher SI), there is less inter-annual variability in the timing of AMW occurrence, and thus the mean date represents better the occurrence dates of AMW from year to year. At broader spatial scales, the datasets can be used for prioritizing locations for flood management and adaptation. As shown in Fig. 3, there is substantial spatial variability in the magnitude of 100-year 24-hour W events, which are typically higher in the ROS-dominated Pacific Northwest mountain ranges, and rain-dominated Gulf coastal plains. The dominant mechanism exhibits greater heterogeneity in topographically complex mountainous regions, and there is a shift in the dominant mechanism from ROS to rain or melt for events with a longer duration (72-hour versus 24-hour) (Fig. 3c,d). The presented datasets can also be used for estimating catchment-scale flood responses by coupling the NG-IDF curves with a rainfall-runoff model as demonstrated in prior work16.
Lastly, it should be noted that the datasets are subject to a few limitations:
-
(1)
Diurnal variability is not represented in the precipitation data used here to force the snow model and construct IDF curves. As a result, our estimates of extreme events are limited to daily or longer durations. While the daily temporal resolution is mostly sufficient to capture snow-related extreme events, short-duration IDF curves based on high-resolution rainfall data are more appropriate for capturing short-duration extremes (e.g., flash floods) or floods in small catchments with fast response times (<24 hours).
-
(2)
Although generally smaller compared to sample data uncertainty8, the choice of probability distribution can contribute to uncertainties in estimates of extremes42. Depending on the usage, one can apply the distribution of choice or ensemble distributions for analysis of extreme W events using provided AMW datasets.
-
(3)
Greater uncertainties in NG-IDF estimates are expected for locations with lower snow model skill, such as the maritime Pacific Northwest and locations exposed to high winds during winter.
-
(4)
Stationarity assumption. Based on the Mann-Kendall test, about 10% of the CONUS shows a statistically significant trend in AMW with any duration of 24, 48, or 72 hours. Hence, the IDF curve estimates based on the stationary assumption are valid for about 90% of the CONUS. For the remaining locations showing a significant trend in AMW, we recommend applying a nonstationary approach for constructing NG-IDF curves. Among a variety of nonstationary approaches17,43,44, here we demonstrate the application of the Non-stationary Extreme Value Analysis (NEVA) Software45 to construct the NG-IDF curve based on the 24-hour AMW at a location in the Oregon State, where AMW has the highest positive trend of 1.12 mm/decade over the CONUS (Supplementary Fig. 1). In this particular case, the analysis suggests that the stationary assumption may lead to the underestimation of extreme events into the future. With the provided AMW series, one can develop nonstationary IDF curves using the approach of choice.
-
(5)
Consideration of land cover. The NG-IDF datasets provided in this study are developed for open conditions (as opposed to forest conditions). Given strong forest-snow interactions33 and their implications for streamflows30, incorporating different land cover types into the development of NG-IDF datasets is an undergoing research effort. The datasets presented here are used as the baseline for understanding the vegetation impacts or land use change (e.g., urbanization) impacts on NG-IDF curves.
Code availability
The source code of the DHSVM model used for snow simulations can be freely downloaded at https://github.com/pnnl/DHSVM-PNNL. The R programming language was used for developing IDF curves, detecting trend and determining seasonality of annual maximum series, using the following packages: trend46, lmom47, circular48. Source codes that were used to develop and analyze the data are publicly available at https://github.com/Lizzy0Sun/NG-IDF-analysis-code/.
References
Berghuijs, W. R., Woods, R. A., Hutton, C. J. & Sivapalan, M. Dominant flood generating mechanisms across the United States. Geophys. Res. Lett. 43, 4382–4390 (2016).
Leung, L. R. & Qian, Y. Atmospheric rivers induced heavy precipitation and flooding in the western U.S. simulated by the WRF regional climate model. Geophys. Res. Lett. 36, L03820 (2009).
Li, D., Lettenmaier, D. P., Margulis, S. A. & Andreadis, K. The role of rain‐on‐snow in flooding over the conterminous United States. Water Resour. Res. 55, 8492–8513, https://doi.org/10.1029/2019wr024950 (2019).
McCabe, G. J., Clark, M. P. & Hay, L. E. Rain-on-Snow Events in the Western United States. Bull. Am. Meteorol. Soc. 88, 319–328 (2007).
Musselman, K. N., Clark, M. P., Liu, C., Ikeda, K. & Rasmussen, R. Slower snowmelt in a warmer world. 7, 214–220 (2017).
Ralph, F. M. et al. The Impact of a Prominent Rain Shadow on Flooding in California’s Santa Cruz Mountains: A CALJET Case Study and Sensitivity to the ENSO Cycle. J. Hydrometeorol. 4, 1243–1264 (2003).
Perica, S. et al. Precipitation-Frequency Atlas of the United States, NOAA Atlas 14. (vol. 8, version 2.0, U.S. Dep. of Commer., National Oceanic and Atmospheric Administration, National Weather Service, Silver Spring, Md. (2013).
Yan, H. et al. Next-Generation Intensity-Duration-Frequency Curves for Hydrologic Design in Snow-Dominated Environments. Water Resour. Res. 54, 1093–1108 (2018).
Yan, H. et al. Next-Generation Intensity–Duration–Frequency Curves to Reduce Errors in Peak Flood Design. J. Hydrol. Eng. 24, 04019020 (2019).
NCEI. Billiondollar weather and climate disasters. Available at https://www.ncdc.noaa.gov/billions/events/US/1980-2018 (2018).
Hagos, S. M., Leung, L. R., Yoon, J., Lu, J. & Gao, Y. A projection of changes in landfalling atmospheric river frequency and extreme precipitation over western North America from the Large Ensemble CESM simulations. Geophys. Res. Lett. 43, 1357–1363 (2016).
Warner, M. D., Mass, C. F. & Salathé, E. P. Changes in Winter Atmospheric Rivers along the North American West Coast in CMIP5 Climate Models. J. Hydrometeorol. 16, 118–128 (2015).
Cao, Q. et al. Floods due to Atmospheric Rivers along the U.S. West Coast: The Role of Antecedent Soil Moisture in a Warming Climate. J. Hydrometeorol. 21, 1827–1845 (2020).
Gershunov, A. et al. Precipitation regime change in Western North America: The role of Atmospheric Rivers. Sci. Rep. 9, 9944 (2019).
Hamlet, A. F. New Observed Data Sets for the Validation of Hydrology and Land Surface Models in Cold Climates. Water Resour. Res. 54, 5190–5197, https://doi.org/10.1029/2018WR023123 (2018).
Yan, H. et al. Evaluating next‐generation intensity–duration–frequency curves for design flood estimates in the snow‐dominated western United States. Hydrol. Process. 34, 1255–1268 (2020).
Yan, H., Sun, N., Chen, X. & Wigmosta, M. S. Next-Generation Intensity-Duration-Frequency Curves for Climate-Resilient Infrastructure Design: Advances and Opportunities. Front. Water 2, 545051 (2020).
Wigmosta, M. S., Vail, L. W. & Lettenmaier, D. P. A distributed hydrology-vegetation model for complex terrain. Water Resour. Res. 30, 1665–1679 (1994).
Serreze, M. C., Clark, M. P. & Frei, A. Characteristics of large snowfall events in the montane western United States as examined using snowpack telemetry (SNOTEL) data. Water Resour. Res. 37, 675–688 (2001).
Kendall, M. G. Rank Correlation Methods. (1975).
Mann, H. B. Nonparametric Tests Against Trend. Econometrica 13, 245 (1945).
Sen, P. K. Estimates of the Regression Coefficient Based on Kendall’s Tau. J. Am. Stat. Assoc. 63, 1379–1389 (1968).
Hosking, J. R. M. & Wallis, J. R. Regional Frequency Analysis: An Approach Based on L-Moments. (Cambridge University Press, Cambridge, U. K., 1997).
Livneh, B. et al. A long-term hydrologically based dataset of land surface fluxes and states for the conterminous United States: Update and extensions. J. Clim. 26, 9384–9392 (2013).
Bohn, T. J. et al. Global evaluation of MTCLIM and related algorithms for forcing of ecological and hydrological models. Agric. For. Meteorol. 176, 38–49 (2013).
Serreze, M. C., Clark, M. P., Armstrong, R. L., McGinnis, D. & Pulwarty, R. S. Characteristics of the western United States snowpack from snowpack telemetry(SNOTEL) data. Water Resour. Res. 35, 2145–2160 (1999).
BCQC SNOTEL data https://www.pnnl.gov/data-products (2019).
Storck, P., Bowling, L., Wetherbee, P. & Lettenmaier, D. Application of a GIS-based distributed hydrology model for prediction of forest harvest effects on peak stream flow in the Pacific Northwest. Hydrol. Process. 12, 889–904 (1998).
Sun, N. et al. Regional Snow Parameters Estimation for Large‐Domain Hydrological Applications in the Western United States. J. Geophys. Res. Atmos. 124, 5296–5313 (2019).
Sun, N. et al. Evaluating the functionality and streamflow impacts of explicitly modelling forest-snow interactions and canopy gaps in a distributed hydrologic model. Hydrol. Process. 32, 2128–2140 (2018).
Luce, C. H., Lopez-Burgos, V. & Holden, Z. Sensitivity of snowpack storage to precipitation and temperature using spatial and temporal analog models. Water Resour. Res. 50, 9447–9462 (2014).
Lute, A. C. & Luce, C. H. Are model transferability and complexity antithetical? Insights from validation of a variable-complexity empirical snow model in space and time. Water Resour. Res. 53, 8825–8850 (2017).
Sun, N. et al. Forest Canopy Density Effects on Snowpack across the Climate Gradients of the Western United States Mountain Ranges. Water Resour. Res. 58, e2020WR029194 (2022).
Likas, A., Vlassis, N. & Verbeek, J. J. The global k-means clustering algorithm. Pattern Recognit. 36, 451–461 (2003).
Laramie, R. L. & Schaake, J. C. J. Simulation of the continuous snowmelt process. (1972).
Freudiger, D., Kohn, I., Stahl, K. & Weiler, M. Large-scale analysis of changing frequencies of rain-on-snow events with flood-generation potential. Hydrol. Earth Syst. Sci. 18, 2695–2709 (2014).
Li, D., Lettenmaier, D. P., Margulis, S. A. & Andreadis, K. The Role of Rain-on-Snow in Flooding Over the Conterminous United States. Water Resour. Res. 55, 8492–8513 (2019).
Musselman, K. N. et al. Projected increases and shifts in rain-on-snow flood risk over western North America. Nat. Clim. Chang. 8, 808–812, https://doi.org/10.1038/s41558-018-0236-4 (2018).
Burn, D. H. Catchment similarity for regional flood frequency analysis using seasonality measures. J. Hydrol. 202, 212–230 (1997).
Villarini, G. On the seasonality of flooding across the continental United States. Adv. Water Resour. 87, 80–91 (2016).
Sun, N. et al. CONUS NG-IDF Data Sets. Zenodo https://doi.org/10.5281/zenodo.5827028 (2022).
Yan, H. & Moradkhani, H. Toward more robust extreme flood prediction by Bayesian hierarchical and multimodeling. Nat. Hazards 81, 203–225 (2016).
Ragno, E. et al. Quantifying Changes in Future Intensity‐Duration‐Frequency Curves Using Multimodel Ensemble Simulations. Water Resour. Res. 54, 1751–1764 (2018).
Hou, Z. et al. Incorporating Climate Non-stationarity and Snowmelt Processes in Intensity-Duration-Frequency Analyses with Case Studies in Mountainous Areas. J. Hydrometeorol. 20, 2331–2346 (2019).
Cheng, L. & AghaKouchak, A. Nonstationary Precipitation Intensity-Duration-Frequency Curves for Infrastructure Design in a Changing Climate. Sci. Rep. 4, 7093 (2015).
Pohlert, T. Package ‘trend’. (2016).
Hosking, J. R. M. Package ‘lmom’. (2015).
Agostinelli, C. Package ‘circular’. (2017).
Acknowledgements
This research is supported by the Strategic Environmental Research and Development Program (SERDP) under contract RC‐2546, and the Environmental Security Technology Certification Program (ESTCP) under contract EW21-5140.
Author information
Authors and Affiliations
Contributions
N.S., Y.H. and M.W. designed the general approach of developing the datasets. N.S. performed model simulations, validation, and data analysis. Y.H. prepared model validation datasets and supported data analysis. M.W. advised and managed the projects that fund this research, and revised the manuscript. N.S. wrote the paper. All authors participated in discussions and reviews during the development of this manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sun, N., Yan, H., Wigmosta, M.S. et al. Datasets for characterizing extreme events relevant to hydrologic design over the conterminous United States. Sci Data 9, 154 (2022). https://doi.org/10.1038/s41597-022-01221-9
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41597-022-01221-9
- Springer Nature Limited