1 Introduction

1.1 Motivation

Water management in the western United States remains a complex challenge for society as scientists and stakeholders alike face an increasingly delicate balancing act between “feast or famine” precipitation regimes (e.g., Swain et al. 2018). Increasing societal demands for water itself, as well as for more robust mitigation of the risks posed by water shortages and excesses, further raise the stakes. Accordingly, enhanced planning is required to avoid water shortage disasters, prevent devastating dam-related floods, and minimize the significant economic losses that accompany suboptimal water storage decision-making. Projections of future precipitation states are of critical importance for agencies tasked with water planning and disaster prevention.

Despite decades of research on the topic of western U.S. precipitation in future climates, regional details and confidence in uncertainty estimates remain elusive. The climatology of present-day precipitation in the western U.S. is driven in large part by landfalling atmospheric rivers (ARs; American Meteorological Society 2018) occurring mostly in the cool season (Oct–Mar) along the U.S. west coast. Across the more arid U.S. Intermountain West, mean and extreme precipitation regimes vary both sub-regionally and across seasons; for example, the climatological spring peak of precipitation across the Montana-Wyoming region contrasts with the southwest U.S. monsoon wet period in the late summer. The complexity in precipitation patterns across the western U.S. is considerable; in the present study we largely focus on cool season (Oct–Mar) precipitation based on its importance to water supply in the western U.S. (e.g., Ralph et al. 2014). While we present most results for the entire western U.S., we also place additional focus on California as a sub-region that both has been emphasized by a considerable volume of recent research (e.g., Rhoades et al. 2018; Swain et al. 2018; Gershunov et al. 2019) and is of particular interest to key stakeholder groups with specific planning needs at critical water resource structures. Precipitation phase is also of critical importance across the western U.S., where water supply is often stored in seasonal snowpack. We thus analyze the representation of precipitation type and snow cover processes as well. By analyzing a large suite of diverse model projections over multiple resolutions and both historical and future periods, this study enhances understanding of projections of regional precipitation phenomena of interest across the western U.S.

1.2 Climate change projections for the western U.S.

Climate change may alter many of the processes and phenomena that influence western U.S. precipitation, both in means and extremes. The effect of climate change on ARs in particular has been studied via a number of different approaches and datasets. Global climate model (GCM) studies generally indicate that the impact of ARs on the western United States will increase both in frequency and intensity, which would accordingly lead to increased heavy precipitation (e.g., Dettinger 2011; Gao et al. 2015; Lavers et al. 2015; Warner et al. 2015; Hagos et al. 2016; Tan et al. 2020). While increased temperature and moisture (the so-called “thermodynamic effect”) appear to dominate the climate change impact on AR intensity (e.g., Kossin et al. 2017), landfall location changes are also evident based on how GCMs represent shifts of the subtropical jet and associated storm tracks (e.g., Gao et al. 2015; Shields and Kiehl 2016; Payne et al. 2020). GCM projections for mean annual precipitation across the U.S. Intermountain West show less agreement (e.g., Lukas et al. 2014; USGCRP 2017), yet consistently indicate a likely increase in frequency and intensity of extreme precipitation for most regions (e.g., Kharin et al. 2013; Janssen et al. 2014, 2016).

Increasing projection resolution through the use of regional climate models (RCMs) reveals qualitatively similar findings; that is, RCM studies largely corroborate the average changes indicated by global model studies, but impart additional spatial, temporal, and impact-relevant detail which is often desirable for water resources planning. For example, Rhoades et al. (2018) used regional climate simulation data from the North American Coordinated Regional Climate Downscaling Experiment (NA-CORDEX) to demonstrate how key hydrometeorological features influencing western U.S. hydrology, such as snowpack, peak timing, melt rate, and snow season length, collectively indicate a nearly 80% reduction in peak snowpack water volume. Similarly, Salathé et al. (2014) employed a regional dynamical downscaling approach to show that the combination of more extreme storms and warming temperatures (causing precipitation type to shift from snow to rain) increases future flood risk in parts of the Pacific Northwest. A growing body of regional studies further demonstrates that increases in AR intensity and temperature may couple to produce winter precipitation that increasingly falls as rain rather than snow, thereby increasing high-elevation, complex terrain flood risk in particular (e.g., Leung et al. 2004; Leung and Qian 2009; Guan et al. 2016; Mahoney et al. 2013). Across the U.S. Intermountain West, regional climate studies suggest variable change signals. Alexander et al. (2013) examined warm season precipitation over Colorado and surrounding states using the North American Regional Climate Change Assessment Program (NARCCAP) dataset, and found overall drier summers despite an increase in the surface specific humidity, but no clear agreement on the sign of change for the most extreme precipitation. Studies focusing on the cool season across the Intermountain West highlight the northward shift in storm tracks as the main mechanism by which future precipitation climatologies change with latitude (e.g., USGCRP 2017). Finally, snow (and snow water equivalent, SWE), and the length of the season over which it falls and persists as snowpack, is generally projected to decline across the broader western U.S., partially due to more precipitation falling as rain than snow, as well as faster melting of snow on the ground (USGCRP 2017; Rhoades et al. 2018; McCrary and Mearns 2019).

1.3 The North American Coordinated Regional Climate Downscaling Experiment (NA-CORDEX)

The NA-CORDEX experiment aims to add value to the existing body of climate model projections by using multiple resolutions and a matrix of global and regional climate models to facilitate regional climate model intercomparison studies and ultimately serve the impact and adaptation communities (Giorgi et al. 2009; na-cordex.org). As the spatial resolution of RCMs continues to increase, even to convection-permitting resolutions, balancing deterministic or very small ensemble collections with larger, more diverse ensembles remains key to exploring uncertainty; this is an important aspect of selecting the NA-CORDEX dataset for this study (Gutowski et al. 2020). Thus, while traditional GCM ensembles typically provide ~ 100-km grid spacing or more, and convection-permitting ensembles offer high resolution but limited simulation membership, NA-CORDEX addresses an important need for stakeholders desiring uniform higher-resolution data that can resolve western U.S. terrain and AR phenomena, with enough multi-model diversity and sufficient ensemble membership to assess projection uncertainty. The larger (worldwide) CORDEX effort began as an initiative from the World Climate Research Program (WCRP), coordinating the regional climate modeling efforts to perform climate projections over large, predefined domains. The horizontal resolution of the simulations began with a relatively coarse grid mesh of 0.44° (~ 50-km grid spacing) in order to generate large ensembles of full 100+ year transient simulations; groups with larger computing resources could optionally perform finer resolution simulations to investigate the added value. In the NA-CORDEX framework, most simulations were performed at both 0.44°/50-km grid spacing and 0.22°/25-km grid spacing, with a few modeling centers also simulating at 0.11°/12.5-km grid spacing. In this study, we compare the 50-km and 25-km grid spacing simulations.

A small but growing number of NA-CORDEX studies have begun to examine precipitation and precipitation extremes. Gibson et al. (2019) examined NA-CORDEX historical daily precipitation indices against multiple gridded observational and reanalysis products, emphasizing the non-triviality of observational product differences across the contiguous United States (CONUS), while further summarizing where dynamical downscaling appears to add value, where it may degrade performance, and where model performance is most sensitive to model resolution. Diaconescu et al. (2016) and Whan and Zwiers (2017) focused on a small subset of RCMs driven by different reanalyses and historical GCMs, finding less sensitivity of model performance to the particular driving datasets, and more sensitivity to the region, season, precipitation characteristics, and climate mode indices examined. Lucas-Picher et al. (2017) examined the sensitivity of a single NA-CORDEX RCM to horizontal resolution, highlighting improvements in key simulated processes such as orographic precipitation and local and regional circulations. Rhoades et al. (2018) used NA-CORDEX simulations to expose significant variability in the simulation of snowpack over the headwaters of ten major California reservoirs. The present study aims to complement these analyses and contribute to the larger body of work seeking to understand what can be learned—and specifically, what can be most effectively used by water management decision-makers—from this relatively new collection of RCM projections.

Model datasets such as NA-CORDEX offer appeal to stakeholders because they can, in theory, provide an array of possible future climate states, derived from physically-consistent, spatially and temporally continuous gridded model output that can be used for secondary/application models. These data, by virtue of being produced by dynamical prognostic models as opposed to those based on statistical modeling using historical conditions, also provide physical process insight into how and why specific climate change impacts evolve in particular model projections. While the NA-CORDEX matrix of model simulations may be but one cluster of relatively new data points in a growing sea of climate model guidance, its design and specific objectives render it an important potential resource in understanding the hydroclimate of this water-sensitive region. Herein, we ask: what does the NA-CORDEX model dataset reveal about western U.S. precipitation projections with respect to means, extremes, precipitation type, and its regional and seasonal distribution? How does the NA-CORDEX project offer unique advantages to stakeholders and end users? Does increased resolution add or change confidence based on consistency with existing climate projections? Does the NA-CORDEX dataset advance physical process-based insight with which to better understand the causes of projected changes?

2 Model simulations and observations

The NA-CORDEX model ensemble (Mearns et al. 2017) is composed of six regional climate models (RCMs): the CRCM5, RCA4, RegCM4, WRF, CanRCM4, and HIRHAM5 (see Table 1). The individual RCM simulations are driven by either reanalysis [ERA-Interim (ERA-INT)] or one of six global climate models (GCMs): the HadGEM2-ES, CanESM2, MPI-ESM-LR, MPI-ESM-MR, EC-EARTH, and GFDL-ESM2M (Table 1). The RCMs examined in this study were run at resolutions of 0.44° (~ 50-km grid spacing) and/or 0.22° (~ 25-km grid spacing). As the overarching purpose of this study is model evaluation, no post-processing (e.g., bias correcting, further statistical downscaling) has been applied, and only simulations having both spatial resolutions (50- and 25-km grid spacing) available are used in the analyses focused on identifying the potential added value from increased resolution. While model domains are similar across the RCMs, regridding, when necessary, was performed using an inverse distance squared method to a common 0.5° × 0.5° grid. For a detailed description of the individual RCM configurations within the NA-CORDEX ensemble, see: https://na-cordex.org/rcm-characteristics.
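The inverse distance squared regridding mentioned above can be illustrated with a minimal sketch. The text names the method but not its details, so the function name `idw2_regrid`, the use of the four nearest source points, and the flat latitude-longitude distance are all assumptions made here for illustration and are not the configuration used in the study.

```python
def idw2_regrid(src_points, target_points, n_neighbors=4):
    """Interpolate source values onto target points with 1/d^2 weights.

    src_points: list of (lat, lon, value) tuples on the native model grid.
    target_points: list of (lat, lon) tuples on the common grid.
    n_neighbors: number of nearest source points used (assumed, not from the paper).
    Distances are taken in degrees on a flat lat/lon plane (a simplification).
    """
    regridded = []
    for tlat, tlon in target_points:
        # Squared distance from this target point to every source point,
        # keeping only the nearest n_neighbors.
        nearest = sorted(
            ((slat - tlat) ** 2 + (slon - tlon) ** 2, value)
            for slat, slon, value in src_points
        )[:n_neighbors]
        if nearest[0][0] == 0.0:
            # Target coincides with a source point: use its value directly.
            regridded.append(nearest[0][1])
            continue
        # Weight each neighbor by the inverse of its squared distance.
        weights = [1.0 / dist2 for dist2, _ in nearest]
        weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
        regridded.append(weighted_sum / sum(weights))
    return regridded


# Four source cells around one target cell (hypothetical precipitation, mm).
source = [(0.0, 0.0, 1.0), (0.0, 1.0, 2.0), (1.0, 0.0, 3.0), (1.0, 1.0, 4.0)]
print(idw2_regrid(source, [(0.5, 0.5)]))  # equidistant neighbors give the plain mean
```

Because the weights fall off with the square of distance, nearby native cells dominate each target value, which tends to preserve sharp orographic gradients better than a simple box average would.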

Table 1 Summary of daily NA-CORDEX data analyzed from historical and RCP8.5 (future scenario)

For the historical period (1976–2005), precipitation values from the NA-CORDEX simulations are compared to values from two high-resolution precipitation datasets developed by Livneh et al. (2013) and Newman et al. (2015). The Livneh et al. (2013) data are available on a 1/16° latitude–longitude grid over the conterminous United States for the years 1915–2011, and provide an update of the Maurer et al. (2002) dataset derived using daily observations from approximately 20,000 National Oceanic and Atmospheric Administration (NOAA) Cooperative Observer stations. The Newman et al. (2015) daily precipitation dataset is a 100-member ensemble in which gauge data are probabilistically interpolated to a 0.125° resolution grid. Terrain impacts (e.g., elevation and slope) are included, and the ensemble approach is designed to account for uncertainties due to spatial undersampling, measurement projection irregularities, as well as random measurement errors. We also examine snow projections from NA-CORDEX using snow water equivalent (SWE), although we omit the RegCM4 simulations in our analysis of historical and future SWE due to unphysical snow accumulation values; this issue is discussed in more detail in Sect. 3.3. While spatially and temporally continuous historical SWE datasets are limited, here we use the National Operational Hydrologic Remote Sensing Center (NOHRSC) Snow Data Assimilation System (SNODAS) data product for 2004–2018 (NOHRSC 2004; Barrett 2003). All future projection analyses are evaluated over the period 2070–2099.

3 Historical simulations

We begin our analyses by comparing the historical simulations to observations with respect to the spatial and seasonal distribution of precipitation monthly and seasonal means, daily extremes, and precipitation type using the 50-km and 25-km RCMs. We focus on the cool season in the ensuing analysis due to the present-day criticality of October–March months in determining adequate annual water supply for the arid western US, but analyses for all seasons can be easily generated online at: https://www.esrl.noaa.gov/psd/ipcc/cordex/.

3.1 Mean cool-season precipitation

Examining mean cool-season (October–March) precipitation across all available NA-CORDEX historical model projections shows that all simulations capture the large-scale orographic precipitation enhancement features in the western U.S. domain (Fig. 1). Magnitudes of mean seasonal precipitation range from 100 mm in lower-elevation locations, to greater than 2500 mm in complex terrain, particularly across the Pacific Northwest. More careful inspection, and subsetting by model resolution and by the largest available shared GCM and RCM subsets, highlights differences (Fig. 2). We focus first on January–February–March (JFM) precipitation to highlight differences associated with the various driving GCMs, RCMs, and model resolutions. With respect to the 50- and 25-km grid spacing simulations, resolution-based differences in precipitation intensity are clear (Fig. 2e, f), with the higher resolution (25-km) models generating more precipitation overall, particularly over the Washington and Oregon Cascade mountains and also along the highest terrain of the California Sierra Nevada range. GCM-based dissimilarities are distinct and suggestive of storm track differences, particularly in comparing the maxima over the California Sierra Nevada region in the GFDL-forced simulations relative to the Pacific Northwest precipitation maxima found in the HAD-forced historical simulations; the MPI-driven simulations appear different still, a solution resembling a blend of sorts between the HAD- and GFDL-forced mean precipitation patterns (for more detailed analysis on these differences, see Bukovsky and Mearns 2020 and Hughes et al. 2020). Comparing mean JFM precipitation by available RCMs shows a wetter solution overall in RegCM4 relative to WRF simulations. Systematic latitudinal shifts in the distribution of mean precipitation according to RCM are less clear relative to GCM-based differences, underscoring the primary influence of GCM-driven dynamical controls (e.g., storm track; Chang et al. 2012; Tan et al. 2020) on determining the large-scale spatial distribution of precipitation relative to RCM-scale processes. The spread between individual model members for the JFM historical period (based on the standard deviation in the 30-year means) maximizes over complex terrain, and thus, in general, the climatologically wettest locations (online supplemental Fig. SM1). Historical ensemble standard deviation values exceed 250 mm over California’s Sierra Nevada mountains, evincing the inter-model differences found across Figs. 1 and 2.
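As a rough illustration of the ensemble spread metric described above (the standard deviation across members of each member's 30-year mean), the following sketch works through the calculation at a single grid cell. The function name, the model labels, and the precipitation values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not state which form was used.

```python
import statistics


def ensemble_spread(seasonal_totals_by_model):
    """Return (ensemble mean, ensemble standard deviation) of multi-year means.

    seasonal_totals_by_model: dict mapping a model label to a list of
    seasonal precipitation totals (mm), one value per year, at one grid cell.
    """
    # Step 1: collapse each member to its own multi-year climatological mean.
    climatologies = [
        statistics.fmean(totals) for totals in seasonal_totals_by_model.values()
    ]
    # Step 2: measure the spread across members (sample std; an assumption).
    return statistics.fmean(climatologies), statistics.stdev(climatologies)


# Hypothetical JFM totals (mm) for three members, two years each.
members = {
    "model_A": [100.0, 200.0],
    "model_B": [300.0, 400.0],
    "model_C": [500.0, 600.0],
}
mean, spread = ensemble_spread(members)
print(mean, spread)
```

Averaging over years before taking the standard deviation means the metric reflects differences between model climatologies rather than year-to-year variability within any one simulation.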

Fig. 1

Mean historical (1976–2005) cool season (ONDJFM) precipitation (mm, as shaded) for each NA-CORDEX model listed in Table 1, as labeled. a Livneh et al. (2013) reanalysis precipitation (mm); b Newman et al. (2015) ensemble average reanalysis precipitation (mm) for same time period

Fig. 2

Historical (1976–2005) mean monthly precipitation (mm) for January, February, and March (JFM). a Newman et al. (2015) mean precipitation estimate; b ensemble mean precipitation for all regional models driven by the GFDL-ESM2M GCM, c as in b except for the HadGEM2-ES GCM, d as in b except for the MPI-ESM GCMs, e as in b except for all 25-km grid spacing simulations, f as in e except for all 50-km grid spacing simulations, g as in b except for all WRF RCM simulations, and h as in g except for all RegCM4 RCM simulations

We next explore the mean seasonal cycle of precipitation over the western U.S. in observations and historical simulations. As the seasonality of mean precipitation is quite heterogeneous within the western U.S. (e.g., cold season maxima along the West Coast vs. warm season monsoon maxima in the inland portions of the Southwest), seasonality must also be analyzed regionally. A regional analysis of the monthly mean precipitation over California is shown in Fig. 3. The Livneh et al. (2013) and Newman et al. (2015) observational datasets are largely in agreement in terms of amount and temporal (annual) distribution, although such close agreement is not necessarily found in other regions (see online supplemental Fig. SM2 for an example over the Upper and Lower Colorado watersheds; see also Gibson et al. 2019). Plotting capability for all seasons and for the larger western U.S. subregion is also available through the NOAA PSL CORDEX Precipitation Analysis page (NOAA PSL 2020): https://www.esrl.noaa.gov/psd/ipcc/cordex/. While the historical monthly precipitation climatology of the NA-CORDEX simulations has the correct shape relative to the observational datasets over California (i.e., a seasonal peak between November and March), both the 50-km and 25-km ensemble mean values produce more precipitation relative to either the Newman or Livneh datasets, in some months by as much as 50% (Fig. 3a). Finally, there is considerable spread between individual model members, at times up to a 250% difference between the wettest and driest simulations, and some variability in the peak timing within the November–March period. There does not appear to be a strong systematic difference attributable to model resolution, at least for this California-focused, regionally averaged, monthly time scale (Fig. 3b).

Fig. 3

a 50-km (red line) and 25-km (green line) simulation historical (1976–2005) ensemble mean precipitation versus reanalysis precipitation [Newman et al. (2015) in black and Livneh et al. (2013) in blue]. Red (green) shaded area shows ± 1 sigma of 50-km (25-km) grid spacing models, and gray shaded area shows Newman et al. (2015) uncertainty bounds containing the full range of the 100 ensemble members. b Mean monthly historical (1976–2005) precipitation (mm/30 days) averaged over the state of California with 50-km (25-km) simulations shown in red (green)

3.2 Extreme daily (99th percentile) precipitation

The representation of extreme precipitation in the historical simulations is examined using the 99th percentile of all JFM daily precipitation values over the historical period (1976–2005). To contextualize the “extremeness” of this measure, the 99th percentile over a three-month period equates to the top 1% of 2700 values, or a roughly once per winter event. Similar to mean monthly precipitation, extreme daily precipitation values maximize across regions of elevated, complex terrain, especially in the California Sierra Nevada (Fig. 4, online supplemental Fig. SM3). Variation due to model resolution is also evident (Fig. 4e, f); for example, daily precipitation maxima exceed 75 mm over the Sierra in the 25-km ensemble average, versus maxima of less than 50 mm in the 50-km ensemble average. Clear disagreement in the spatial distribution of precipitation is found at points farther inland as well (e.g., Idaho, southwestern Arizona and New Mexico, and within the Great Basin itself), and, as found for the historical seasonal mean values, ensemble spread maximizes in these locations (online supplemental Fig. SM1b, d). The first-order spatial distribution patterns in these inland regions of complex terrain appear to be somewhat more strongly influenced by RCM choice rather than by model resolution (cf. Fig. 4e, f, g, h), and the magnitude of daily extremes in these inland locations also increases markedly with increased RCM resolution (Fig. 4e, f).
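The "roughly once per winter" interpretation of the 99th percentile follows from simple counting, sketched below. The 90-day JFM length (ignoring leap days) and the nearest-rank percentile definition are assumptions made for illustration; the text does not specify which percentile estimator was used, and the function names are hypothetical.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of
    the data at or below it. pct is an integer percentage (e.g., 99)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def exceedances_per_season(n_years, days_per_season, pct):
    """Average number of days per season that exceed the pct-th percentile."""
    n_values = n_years * days_per_season
    n_above = n_values - math.ceil(pct * n_values / 100)
    return n_above / n_years


# 30 historical winters x 90 JFM days (31 + 28 + 31; leap days ignored).
n_values = 30 * 90
print(n_values)                           # 2700 daily values
print(exceedances_per_season(30, 90, 99)) # 27 exceedances over 30 winters
```

With 2700 daily values, 27 fall above the 99th percentile, which is slightly fewer than one per winter on average. Note that the percentile here is taken over all days, wet or dry, as stated in the text.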

Fig. 4

As in Fig. 2 except for extreme daily (daily 99th percentile) precipitation

We again use California as a regional focus area to examine NA-CORDEX seasonal representation of historical daily precipitation extremes. Nearly all of the historical model simulations capture the observed DJF peak of daily precipitation extremes over California, with the 25- and 50-km simulation mean values both generally falling within or slightly above the envelope of observational spread from the Newman et al. (2015) dataset (Fig. 5). While the 25-km simulations generate larger extreme daily values on average relative to the 50-km simulations in DJF, the impact of simulation resolution is less obvious through this particular lens of monthly- and spatially-averaged extreme daily precipitation. This resolution-based sensitivity is muted at least in part because spatial averaging smooths the tight, orographically driven precipitation gradients (online supplemental Fig. SM3). California generally obeys a coherent, single-maximum seasonal cycle relative to other parts of the western U.S.; the Upper and Lower Colorado River basins can again be used to demonstrate regional differences in extreme daily precipitation, including multiple maxima across different seasons (online supplemental Fig. SM4).

Fig. 5

As in Fig. 3 except for extreme daily (daily 99th percentile) precipitation

3.3 Snowfall, SWE, and precipitation-type

Snow water equivalent (SWE) is evaluated over the western U.S., where the annual maximum SWE is calculated at each grid box, and its median value over the 30-year historical time period is shown for all physically-viable models (detailed below) in Fig. 6. Median annual maximum SWE shows immense model-to-model variability, with inter-simulation differences ranging upwards of 1000% over the highest peaks of the Sierra Nevada (Fig. 6, online supplemental Fig. SM5). Snowfall as a distinct accumulated precipitation phase was not archived from the NA-CORDEX simulations, but if one estimates snowfall from positive changes in daily SWE values ($\mathrm{snowfall}_n = \mathrm{SWE}(t_n) - \mathrm{SWE}(t_{n-1}) > 0$) and assumes that loss from snow ablation is small, then we can also calculate the percent of total annual precipitation that falls as snow (Fig. 7). Relative to simply comparing SWE differences, this metric removes the effect of differences in annual total precipitation on snowfall across the RCMs, effectively normalizing changes in SWE by changes in precipitation. Figure 7 demonstrates that the spatial distribution of the snowfall-liquid precipitation ratio varies with RCM and resolution. That is, there are notable spatial distribution variations in the WRF 50-km versus the CRCM5 50-km simulations driven by the same GCM (e.g., MPI-ESM-LR, Fig. 7c, g), and while all snowfall percent maxima anchor around elevated terrain, increased spatial resolution (e.g., the WRF and CanRCM4 simulations) yields more intricate topographic detail and larger maximum values. The spatial differences are consistent with increased resolution affording higher mountains, and thus colder temperatures and higher fractional snowfall. Variability in the monthly distribution of SWE is also quite high among some of the models, with MPI- and CanESM2-forced projections tending to peak later in the season (March–April), and HadGEM-WRF peaking as early as January–February in many locations (online supplemental Fig. SM6).
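The snowfall estimate described above can be sketched as follows for a single grid cell. This is a minimal illustration under the stated assumption that ablation of new snow is negligible; the function names and the sample SWE and precipitation values are hypothetical and are not taken from the NA-CORDEX archive.

```python
def snowfall_from_swe(daily_swe):
    """Estimate accumulated snowfall (mm) as the sum of positive
    day-to-day SWE increments; decreases (melt, sublimation) are ignored."""
    return sum(
        max(0.0, today - yesterday)
        for yesterday, today in zip(daily_swe, daily_swe[1:])
    )


def snow_fraction_percent(daily_swe, daily_precip):
    """Percent of total precipitation estimated to have fallen as snow."""
    total_precip = sum(daily_precip)
    if total_precip <= 0.0:
        return float("nan")  # undefined where no precipitation fell
    return 100.0 * snowfall_from_swe(daily_swe) / total_precip


# Hypothetical five-day record: SWE (mm) and daily precipitation (mm).
swe = [0.0, 10.0, 25.0, 20.0, 30.0]     # day 4 shows 5 mm of melt
precip = [0.0, 10.0, 20.0, 0.0, 20.0]   # 50 mm total
print(snowfall_from_swe(swe))             # 10 + 15 + 10 = 35 mm
print(snow_fraction_percent(swe, precip)) # 35 / 50 -> 70 percent
```

Because any melt that occurs on the same day as snowfall reduces the daily SWE increment, this approach yields a lower bound on snowfall, which is why the small-ablation assumption matters.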

Fig. 6

Historical (2004–2018) median seasonal maximum snow water equivalent (SWE, mm), where a shows the SNODAS SWE observational estimate, and b–l display all other available RCMs as labeled in upper left. RegCM4 simulations are omitted due to unphysical SWE values as discussed in the text

Fig. 7

As in Fig. 6 except for the ratio (%, as shaded at bottom) of annual snowfall to annual total precipitation in historical projections as labeled

Note that we do not include any of the RegCM4 simulations in our analysis of historical or future SWE. We have found that unphysical snow accumulation occurs at select gridpoints in the western U.S. in all of the RegCM4 simulations (online supplemental Fig. SM5; see also McCrary et al. 2020). Snow processes are highly parameterized in land surface models (LSMs) such that small biases in temperature and precipitation, combined with the insufficient representation of complexities of snow physics (e.g., thermodynamics, distribution, aging, and albedo), can lead to the runaway accumulation of snow. While some LSMs avoid this problem by constraining SWE to be below a specific threshold (e.g., the Community Land Model; van Kampenhout et al. 2017; Lawrence et al. 2018), the LSM used in RegCM4 does not have this constraint (see McCrary et al. (2017) for details). However, it is beyond the scope of this study to fully diagnose the root cause of this issue. Finally, the interpretation of historical SWE projections is also hampered by a lack of reliable, spatially-distributed observations. This is a well-known and complex issue motivating entire sub-fields of expertise in both climate and snow science; here we simply note that the NA-CORDEX models have a large range of snowfall and SWE values as well as projected timing differences that exceed the model spread seen for total precipitation projections.

4 Future projections

4.1 Mean cold-season precipitation

A salient finding within NA-CORDEX projected cool-season mean precipitation changes is a strong sensitivity to the definition of the cool season itself. Starkly different precipitation changes are projected during OND vs. JFM (Fig. 8e, g). For example, ensemble mean precipitation in OND increases over the Pacific Northwest and sharply decreases over the Sierra (Figs. 8, 9, online supplemental Figs. SM7, SM8, SM9); this signal is in contrast with that seen in JFM, in which ensemble mean precipitation increases over Northern and Central California, and relatively less so over the Pacific Northwest. In California, the OND drying is most prominent in the WRF and RegCM4 simulations forced by the HadGEM2 and MPI-ESM GCMs. The signal is present, but not as large, in other models. The overall precipitation change signal in the Pacific Northwest in OND varies across the individual models, as evidenced by lack of model agreement there and an ensemble spread maximum (Fig. 8e, online supplemental Fig. SM10). There is a roughly even split between positive and negative precipitation change signals determined mainly by the driving GCM, and presumably, storm track differences (Fig. 8, online supplemental Figs. SM7, SM8). In JFM, individual model signals reverse in many simulations: the HadGEM2- and MPI-ESM-forced WRF and RegCM4 simulations shift to a drying pattern in the Pacific Northwest and a precipitation increase over California. Conversely, the GFDL-forced WRF and RegCM4 simulations show JFM drying over much of California. A notably wetter overall future pattern is seen in the CanESM2-CanRCM4 simulations; the large magnitude of this change likely drives a large part of the JFM ensemble mean signal and is also evident in terms of ensemble spread (online supplemental Figs. SM8, SM10). Possible reasons for these shifts are explored for a subset of NA-CORDEX models in Hughes et al. (2020).

Fig. 8

a Historical mean OND precipitation (mm), b future projected OND precipitation, c as in a except for JFM, d as in b except for JFM. e Future-historical projected change (mm) for OND, f as in e except as percent change, g as in e except for JFM, h as in f except for JFM. Black (grey) dot matrix stippling in e and g indicates > 75% of the models agree that the anomalies are positive (negative)
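The stippling criterion in the Fig. 8 caption (more than 75% of models agreeing on the sign of the anomaly) can be expressed as a simple per-grid-cell test. The sketch below is illustrative only; the function name, the return labels, and the sample change values are hypothetical.

```python
def sign_agreement(changes, threshold=0.75):
    """Classify model agreement on the sign of change at one grid cell.

    changes: list of future-minus-historical changes (mm), one per model.
    Returns "positive" or "negative" when strictly more than `threshold`
    of the models share that sign, otherwise None (no stippling).
    """
    n_models = len(changes)
    frac_positive = sum(change > 0 for change in changes) / n_models
    frac_negative = sum(change < 0 for change in changes) / n_models
    if frac_positive > threshold:
        return "positive"
    if frac_negative > threshold:
        return "negative"
    return None


# Hypothetical projected changes (mm) from five models at one grid cell.
print(sign_agreement([12.0, 3.5, 8.1, 0.4, -2.0]))  # 4 of 5 (80%) positive
print(sign_agreement([12.0, 3.5, 8.1, -2.0]))       # 3 of 4 (75%) is not > 75%
```

A test of this kind considers only the sign of each model's change, not its magnitude, so a stippled cell can still have a large ensemble spread.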

Fig. 9

a Future-historical mean projected change (mm, as shaded) in OND mean precipitation for all 25-km simulations; b as in a except for all 50-km simulations, c as in a except for JFM, d as in b except for JFM

To a first order, projected change patterns do not vary drastically between 50-km and 25-km simulations (Fig. 9). Similar to the historical simulation findings, spatial patterns of precipitation change appear to be more strongly influenced by GCM and RCM choice rather than by model resolution, with an overall sharpening of gradients found around areas of complex terrain for the higher resolution simulations (Fig. 9). One area of enhanced sensitivity appears to be in the Pacific Northwest, where, relative to the corresponding 50-km simulation, JFM 25-km simulations show a much more distinct couplet of enhancement and drying to the respective west and east of the Washington and Oregon Cascades. The 25-km CanESM2.CanRCM4, MPI-ESM-LR.RegCM4 and the HadGEM2-ES.RegCM4 models are particularly acute in their strong drying pattern east of the Washington and Oregon Cascades; this signal is opposite to that found in both of the GFDL-ESM2M forced WRF and RegCM4 25-km simulations (online supplemental Figs. SM7, SM8).

With respect to the seasonal cycle, the California-averaged cool season seasonal projection pattern (in which less precipitation is produced in the early fall and early spring, and more precipitation occurs mid-winter) is somewhat more amplified in the 50-km simulations; that is, the range of potential change in the mean monthly precipitation is larger in the 50-km simulations relative to the 25-km simulations (Fig. 10). Given that mean monthly precipitation change projections show spatial sharpening in the 25-km simulations but larger seasonal amplification in the statewide spatially-averaged lower resolution simulations, determining the potential impact or benefit of using 25-km grid spacing relative to 50-km clearly depends on the sub-region and application of interest. For specific stakeholder needs, one might consider that if larger averages in space and time are used (e.g., as inputs to a monthly, coarse-resolution secondary modeling application) and the sign of projected changes is unchanged as one averages, then the impact of increasing from 50-km to 25-km grid spacing may not be of primary importance. However, if consideration of local impact scenarios on finer space and time scales is desired, then the 25-km simulations may offer additional, potentially useful, detail, particularly in areas of complex terrain. Such potential utility is discussed in Sect. 5.

Fig. 10

a Seasonal mean monthly future precipitation (mm/30 days) averaged over the state of California for individual 50-km (25-km) simulations shown in red (green) lines; b Future-historical projected changes in mean monthly precipitation (mm/30 days) for 50-km (red) and 25-km (green) simulations. Red (green) shaded area shows ± 1 sigma of 50-km (25-km) grid spacing models for projected future-historical mean monthly change

4.2 Extreme daily (99th percentile) precipitation

The NA-CORDEX ensemble mean of future daily extreme precipitation projections yields a general domain-wide increase in daily extreme precipitation (again using the 99th percentile of daily precipitation as the metric; Fig. 11). Absolute increases are most prominent over central to northern California, from the coast to the high terrain of the Sierra, and model agreement on the sign of change for these signals is relatively robust, particularly in JFM (Fig. 11c). Percentage changes are more apparent over climatologically drier regions (based on the historical climate), with increases of 15–30% over the Nevada-Utah Great Basin and Upper Missouri River basin (eastern Montana and northeast Wyoming) regions (Fig. 11d). Model spread in the spatial distribution of projected changes is nevertheless prominent for some sub-regions (Fig. 11c; online supplemental Figs. SM10–SM12). Notable exceptions to the ensemble mean-indicated overall increase in extreme precipitation include the EC-EARTH.RCA4 and HadGEM2-ES.RegCM4 (50-km) simulations, which show a decrease in 99th percentile daily precipitation over central-northern California (in the region of Lake Tahoe). The GFDL- and CanESM2-forced models also highlight an area of decreased projected extreme daily precipitation along the US-Mexico border; this is the only region where increases in daily extreme precipitation are not shared by the majority of models (Fig. 11c; online supplemental Figs. SM11, SM12). We further note that the extreme daily precipitation change signals are less sensitive to the months chosen to define the cool season than the mean precipitation change projections (online supplemental Figs. SM11, SM12, SM13).
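For readers who wish to reproduce diagnostics of this type, the following is a minimal sketch of a 99th percentile daily precipitation change calculation with a simple sign-agreement measure. It is an illustration under stated assumptions, not the code used in this study: the function names are hypothetical, the percentile is taken over all days in the record (wet and dry), all members are assumed to share a common grid, and the input fields are synthetic.

```python
import numpy as np

def p99_daily(precip):
    """99th percentile of daily precipitation at each grid point.

    precip: array of shape (n_days, ny, nx), in mm. Taking the percentile
    over all days (including dry days) is an assumption.
    """
    return np.percentile(precip, 99, axis=0)

def extreme_change(hist_members, fut_members):
    """Ensemble-mean absolute and percent change in p99, plus sign agreement."""
    p_hist = np.stack([p99_daily(m) for m in hist_members])
    p_fut = np.stack([p99_daily(m) for m in fut_members])
    delta = p_fut - p_hist                       # change for each member
    abs_change = delta.mean(axis=0)              # ensemble-mean change (mm)
    hist_mean = p_hist.mean(axis=0)
    # Percent change relative to the historical ensemble mean; NaN where it is zero.
    pct_change = np.divide(100.0 * abs_change, hist_mean,
                           out=np.full_like(hist_mean, np.nan),
                           where=hist_mean > 0)
    # Fraction of members whose change has the same sign as the ensemble mean.
    agreement = (np.sign(delta) == np.sign(abs_change)).mean(axis=0)
    return abs_change, pct_change, agreement

# Synthetic placeholder fields (not NA-CORDEX output): 3 members on a 4 x 5 grid.
rng = np.random.default_rng(0)
hist_members = [rng.gamma(0.5, 8.0, size=(3000, 4, 5)) for _ in range(3)]
fut_members = [1.2 * m for m in hist_members]    # idealized uniform 20% scaling
abs_change, pct_change, agreement = extreme_change(hist_members, fut_members)
```

Because percent change is normalized by the historical value, the same absolute increase yields a larger percentage in climatologically drier grid cells, which is the contrast drawn between the absolute and percent change maps.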

Fig. 11

a Historical (1976–2005) full NA-CORDEX ensemble mean JFM 99th percentile daily precipitation (mm, as shaded); b as in a except for future (2070–2099) simulations; c Future-historical projected changes (mm, as shaded); d as in c except for percent change

From a seasonal cycle perspective, changes in daily extreme precipitation (analyzed at each gridpoint and then averaged) over California largely mirror the changes for monthly mean precipitation, that is, the most robust increases tend to occur in the mid-winter months (DJF) (Fig. 12). Given the nature of California cold season precipitation (i.e., large precipitation totals produced by ARs, modulated by dry periods in between), it is somewhat intuitive that the projected changes in monthly mean and extreme precipitation amounts show the same trends. Model resolution sensitivity is also mixed and likely somewhat muted from this perspective; differences between the 25-km and 50-km simulations are not consistent or obviously meaningful for state-averaged, mean monthly daily extreme values, although the 50-km simulations do produce larger monthly- and California state-averaged daily extreme precipitation across DJF.
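The order of operations noted above (percentile at each gridpoint first, spatial average second) can be sketched as follows. This is a hedged illustration only: the function name `monthly_p99_area_mean`, the boolean region mask, the unweighted area mean, and the idealized 360-day calendar are all assumptions, and the input values are synthetic.

```python
import numpy as np

def monthly_p99_area_mean(precip, months, mask):
    """Seasonal cycle of gridpoint p99 daily precipitation, averaged over a region.

    precip: (n_days, ny, nx) daily precipitation (mm)
    months: (n_days,) calendar month (1-12) of each day
    mask:   (ny, nx) boolean array selecting the region of interest
    An unweighted mean over masked cells is an assumption; area weights
    could be applied instead.
    """
    cycle = np.full(12, np.nan)
    for m in range(1, 13):
        days = precip[months == m]
        if days.shape[0] == 0:
            continue                               # no data for this month
        p99 = np.percentile(days, 99, axis=0)      # percentile at each gridpoint
        cycle[m - 1] = p99[mask].mean()            # then average over the region
    return cycle

# Synthetic placeholder data (not NA-CORDEX output) on a 2 x 2 grid.
months = np.repeat(np.arange(1, 13), 30)           # idealized 360-day calendar
precip = np.zeros((360, 2, 2))
precip[months == 1] = 5.0
precip[months == 7] = 1.0
mask = np.array([[True, True], [True, False]])
cycle = monthly_p99_area_mean(precip, months, mask)
```

Averaging precipitation over the region before taking the percentile would generally give a different, smoother result, so the order matters when comparing across grid spacings.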

Fig. 12

As in Fig. 10 except for 99th percentile daily precipitation

4.3 Snowfall, SWE, and precipitation-type

There is a clear consensus across the NA-CORDEX ensemble regarding changes in snowfall and SWE. The models project considerable decreases in future annual maximum SWE and snowfall-total precipitation ratios, both of which are accompanied by earlier dates of maximum seasonal SWE (Figs. 13, 14 and online supplemental Figs. SM14, SM15, SM16). The greatest losses in future projected maximum seasonal SWE occur over the Pacific Northwest, generally during the months of greatest historical accumulation (JFM), tapering as SWE diminishes in the later spring months (AMJ).
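The two snow diagnostics discussed here, the magnitude and timing of maximum seasonal SWE and the ratio of snowfall to total precipitation, can be sketched as below. This is a minimal example under stated assumptions rather than the method used in this study: the function names are hypothetical, the monthly columns are assumed to be in water-year order (October through September), the median across years is one possible summary, and all values are synthetic.

```python
import numpy as np

def peak_swe_summary(swe_monthly):
    """Median annual-maximum SWE and median month index of the maximum.

    swe_monthly: (n_years, 12) area-averaged monthly SWE (mm), with columns
    assumed to run October..September (index 0 = Oct, index 5 = Mar).
    """
    swe = np.asarray(swe_monthly, dtype=float)
    annual_max = swe.max(axis=1)          # seasonal maximum for each year
    peak_index = swe.argmax(axis=1)       # month index at which it occurs
    return float(np.median(annual_max)), float(np.median(peak_index))

def snowfall_fraction(annual_snowfall, annual_precip):
    """Ratio of annual snowfall (liquid equivalent) to annual total precipitation."""
    snow = np.asarray(annual_snowfall, dtype=float)
    total = np.asarray(annual_precip, dtype=float)
    # NaN where total precipitation is zero.
    return np.divide(snow, total, out=np.full_like(total, np.nan), where=total > 0)

# Synthetic placeholder values (not NA-CORDEX output): 2 years per period.
hist_swe = np.array([[0, 20, 80, 160, 240, 300, 220, 90, 10, 0, 0, 0],
                     [0, 10, 60, 140, 230, 280, 200, 70, 0, 0, 0, 0]])
fut_swe = np.array([[0, 5, 40, 100, 150, 120, 40, 0, 0, 0, 0, 0],
                    [0, 0, 30, 90, 130, 110, 30, 0, 0, 0, 0, 0]])
hist_peak, hist_month = peak_swe_summary(hist_swe)
fut_peak, fut_month = peak_swe_summary(fut_swe)
ratio = snowfall_fraction([300.0, 0.0], [1000.0, 0.0])
```

In this synthetic case the future peak is both smaller and one month earlier than the historical peak, which is the qualitative behavior described in the text.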

Fig. 13

a Historical (1976–2005) median max snow water equivalent (SWE, mm) annual evolution (months on x-axis) for each available NA-CORDEX member averaged over the state of California; grid spacing distinguished for 50-km (25-km) by solid (dashed) lines; b as in a except for future (2071–2100) period, c as in a except for future-historical projected change; d ensemble mean change in median max SWE from historical to future period (mm, as shaded)

Fig. 14

As in Fig. 7 except for future-historical projected change in the ratio of annual snowfall to annual total precipitation in RCP8.5 projections (Panel a missing because no observations available for future period)

The monthly time series analysis over California (Fig. 13) underscores that the impact of model resolution on SWE is more apparent than for total precipitation; the 25-km simulations systematically show larger seasonal SWE values in both the historical and future simulations relative to their 50-km counterparts (dashed vs. solid lines in Fig. 13a, b), and accordingly possess longer cold-season periods prior to complete melt-out. The fraction of future projected precipitation that falls as snowfall is naturally related to the magnitude of the regional change in temperature, which is largely determined by the driving GCM (Bukovsky and Mearns 2020). For example, among the WRF simulations, the HadGEM-, MPI-, and GFDL-forced simulations have the greatest, moderate, and least snowfall ratio decreases, respectively, consistent with the greatest, moderate, and least temperature change (not shown; see Bukovsky and Mearns 2020).

The decrease in the fraction of precipitation falling as snow in the NA-CORDEX ensemble corroborates the findings of previous studies (McCrary and Mearns 2019; Kapnick and Delworth 2013; Rhoades et al. 2018; Easterling et al. 2017; Klos et al. 2014). This reduction in snowfall is a major contributor to the reduction in SWE shown here, but SWE will also be influenced by increased midseason snowmelt and thus a reduced residence time of snow on the ground (e.g., Luce et al. 2014). The differences in SWE and snow ratio are also largely related to differences in surface temperatures and in the manner by which the various land surface models in the RCMs allow fallen snow to remain on the ground versus melt (McCrary et al. 2017; McCrary and Mearns 2019). Despite the challenges in snowfall and SWE assessment due to limited observations, a relatively small model ensemble (particularly as RegCM4 SWE is not used in this analysis), and significant model spread in the historical simulations, the consistency in the sign and magnitude of projected changes in snowfall in this region still suggests a degree of projection confidence by way of model agreement that may provide value to stakeholders and decision-makers.

5 Conclusions

5.1 Summary

In this study, we evaluate the representation of mean monthly and seasonal precipitation, extreme daily precipitation, and snowfall ratios and snow water equivalent from the NA-CORDEX RCM ensemble across historical and future periods.

NA-CORDEX models generally reproduce the historical observed large-scale orographic precipitation enhancement features across the western U.S., although they tend to overestimate mean seasonal precipitation relative to the observations used here. When evaluating historical mean monthly precipitation, sensitivity to the driving GCM is apparent from latitudinal shifts in monthly precipitation distributions, suggestive of a dependence on GCM-dictated storm track patterns. The higher resolution (25-km grid spacing) models generate more mean monthly precipitation overall, particularly in complex, elevated terrain. Some particular RCM subsets (e.g., RegCM4) produce a notably wetter solution relative to other RCM subsets (e.g., WRF), but the large-scale spatial distribution of monthly mean precipitation is largely determined by the driving GCM. Historical cool-season daily precipitation extremes maximize across regions of elevated, complex terrain along the U.S. West Coast, with greater intensities noted in the higher resolution 25-km simulations.

California is highlighted as a sub-region within the western U.S. over which we assess the seasonal distribution of precipitation. The historical monthly precipitation climatology of the NA-CORDEX simulations has the correct shape of the annual cycle relative to the observational datasets, although both the 50-km and 25-km ensemble mean values produce more precipitation relative to observations, in some months by as much as 50%. There is also considerable spread between individual model members, with no systematic difference clearly attributable to model resolution, although this result stems at least partially from cancelation effects due to spatial and temporal averaging. Historical daily extreme (99th percentile) precipitation over California also peaks in DJF, with the 25- and 50-km simulation 30-year ensemble mean daily values generally falling within or slightly above the envelope of observational spread (Fig. 5); resolution does not systematically affect the seasonal cycle of California extrema, but sensitivity to resolution may increase for regions farther inland, which are controlled more strongly by local processes and local terrain features (e.g., Upper and Lower Colorado River regional analyses are provided in the online supplemental material, Figs. SM2, SM4).

Finally, snowfall-total precipitation ratios and SWE show vast model-to-model variability, with inter-simulation differences ranging upwards of 1000% over high-elevation terrain. As snow integrates the combined biases and uncertainties in both temperature and precipitation, the seasonal variability of SWE accumulation is quite large, with some models peaking in March–April, but others peaking as early as January–February. Model resolution appears to play a more significant role for SWE than for total precipitation; the 25-km simulations tend to have larger seasonal SWE values than their 50-km counterparts, and, in some cases, retain SWE later in the season. However, the evaluation of SWE model output is somewhat limited by a lack of reliable, spatially-distributed observations.

Projected future changes in monthly and seasonal mean precipitation are found to be generally consistent with other recent studies of western U.S. precipitation projections; that is, results for seasonal mean changes are mixed and regionally dependent, while there is more general agreement on an increase in precipitation associated with extreme events. However, we note sensitivity to a number of factors. First, inter-model variability of future projections can be considerable within the NA-CORDEX ensemble, with the largest and most spatially-sweeping changes again suggestive of storm track differences dictated by the driving GCM, and local-scale precipitation magnitudes and terrain-controlled mesoscale details differing considerably by RCM. Second, the definition of the cool season (e.g., focusing on OND vs. JFM) can completely reverse the apparent sign of projected mean precipitation change. Ensemble mean precipitation in OND increases across much of the Pacific Northwest, but sharply decreases over the California Sierra region (Fig. 8 and online supplemental Figs. SM7, SM8). This signal is in stark contrast with that of JFM, in which the ensemble mean precipitation increases over northern and central California, and relatively less so over the Pacific Northwest, although there is again considerable variability between individual models.

Regarding future projections of daily extreme precipitation, there is considerable ensemble agreement on the positive sign change (increase) in the intensity of future daily extreme precipitation across nearly the entire western U.S. domain. Seasonally, daily precipitation extremes over California increase most in the mid-winter months (DJF), but a general upward trend is present year-round, demonstrating that the upper bound on, and potential for, flood-inducing precipitation does not decrease even in areas of projected mean drying.

Finally, a marked projected decrease in future seasonal snowfall fraction and SWE is found across all models, accompanied by earlier dates of maximum seasonal SWE accumulation (Fig. 13, and online supplemental Figs. SM14, SM15, SM16). The greatest losses in future projected SWE over California occur during the months of historically greatest accumulation (JFM), with near-zero SWE values projected in the later spring months (AMJ) for many historically snow-covered locations. Such a consistent change signal across the NA-CORDEX ensemble highlights one area of relatively high projection confidence, and prompts additional investigation of flood risk by region and by distinguishing flood-producing mechanisms (e.g., Kundzewicz et al. 2013; Berghuijs et al. 2016; Musselman et al. 2018). NA-CORDEX ensemble agreement also points to enhanced secondary flood risk factors such as more precipitation falling as rain rather than snow, and the changing character of snowmelt (lower totals and earlier spring meltout) in certain locations.

5.2 Interpretation in the context of other climate projection datasets

The results of the NA-CORDEX analysis over the western United States share several common themes and findings with other recent, independent climate projection studies for this region. Placing this study’s results in a broader context can help identify where agreement with other studies may increase confidence in certain aspects of western U.S. precipitation projections, while identifying outstanding areas of uncertainty helps prioritize future research directions.

The projected constriction of California’s cool season precipitation, whereby less precipitation is produced state-wide in October, November, and early spring and more precipitation becomes concentrated in the mid-winter months, aligns closely with the analysis by Swain et al. (2018) of independent climate model projections (i.e., using models distinct from those in NA-CORDEX) from the CESM Large Ensemble (LENS; Kay et al. 2015). Swain et al. (2018) demonstrate overall drying (in monthly mean precipitation) across most latitudes in California for the fall-early winter months, and an overall wetter January and February by the end of the twenty-first century. Other studies such as Dong et al. (2019) also highlight an amplification of the precipitation seasonal cycle along the U.S. West Coast using CMIP5 GCMs. As witnessed by recent extreme wildfire events and periods of drought, decreasing “shoulder season” (fall and spring) precipitation, though lower in overall present-day climatological amounts, is a critical consideration in terms of ending the fall dry season and associated fire risk, and also extending the spring wet season to adequately support agriculture and water supply needs. Therefore, this is an important seasonal detail for planning purposes, and while corroboration across studies may increase confidence in a qualitative sense, significant variation within the NA-CORDEX dataset relative to the other aforementioned study datasets underscores that uncertainties remain. To address nuanced questions such as this, it is key to understand the interplay between large ensembles generated by a single GCM versus multi-model RCM ensembles and the differences in the types of spread each approach generates (e.g., internal variability vs. internal variability combined with fundamental GCM-RCM differences). Comprehensively acknowledging and integrating these different approaches will be critical to designing the most useful future regional climate projections (e.g., Gutowski et al. 2020).

Considerable agreement was found between the present study and those using statistical climate model downscaling methods. Such analyses (e.g., Gutmann et al. 2012; Pierce et al. 2013) find similar trends in California projected precipitation in mean monthly precipitation statistics, as well as increasing variability in extremes and a constriction of the mid-winter, heavy precipitation-producing season. In contrast to statistical methods, dynamical downscaling (i.e., NA-CORDEX) has the ability to simulate spatial shifts and redistributions in precipitation with physical fidelity, and the ability to capture critical synoptic-scale dynamical weather and climate features such as the North American monsoon (Pierce et al. 2013).

Additional findings with relatively robust agreement and thus strong potential stakeholder implications include the significant decrease in snowfall and SWE by 2100, along with the fairly systematic projected increase in western U.S. daily extreme precipitation intensity. That this latter effect occurs relative to more modest and mixed changes in seasonal mean precipitation also corroborates the results of Swain et al. (2018), reinforcing that uncertainty in mean seasonal precipitation changes does not necessarily decrease confidence in projected changes in precipitation extremes.

Finally, the sensitivity of the NA-CORDEX RCM projections to resolution also concurs with recent related regional climate modeling studies. For the metrics examined here (mean monthly precipitation, daily extreme precipitation, and SWE), over the western U.S. during the cold season, the impact of increasing model resolution from ~50-km grid spacing to ~25-km grid spacing does not appear to drastically alter diagnostics such as monthly-scale precipitation climatology, but is relatively more important for daily precipitation extrema and snowfall. This is perhaps not surprising given how a 50-km vs. a 25-km model grid box represents strong synoptically-forced precipitation (i.e., a blend of parameterized and explicit precipitation), and how such processes average out over monthly and multi-year averages. However, for this particular western U.S. region, there are examples of terrain-controlled precipitation patterns, precipitation type (snow vs. rain), and moisture transport features which should be, and in some cases clearly are, impacted by model resolution. Related RCM studies have suggested that ~12.5-km grid spacing better reproduces mean and extreme precipitation for almost all regions and seasons, citing that this resolution is needed to most effectively capitalize on the improved representation of orography (e.g., Prein et al. 2016; Lucas-Picher et al. 2017), but that it may yet be insufficient for critical hydrologic applications (e.g., Castaneda-Gonzalez et al. 2019; He et al. 2019; Smiatek and Kunstmann 2019; Xu et al. 2019). The results of this study support the general notion that ~50-km grid spacing is sufficient for resolving regional-scale effects resulting from large-scale precipitation systems that characterize the climate of many locations in the western United States, but that smaller-scale physical processes critical for determining extreme precipitation, as well as land-surface processes controlling snow-dominated regions, likely require finer grid spacing.

5.3 Considerations for potential users and stakeholders

NA-CORDEX precipitation projections add confidence to certain aspects of the state of knowledge concerning the future of western U.S. precipitation, and also highlight outstanding areas of uncertainty. How can end users harness both confidence and uncertainty information to optimally use NA-CORDEX to guide water resources management? We offer the following considerations for potential users and stakeholders:

  • An increase in the magnitude of cool-season, western U.S. daily extreme (99th percentile) precipitation is a consistent finding that can be useful in both scenario planning and to inform inputs for secondary application models.

  • Projected changes in seasonality (e.g., constriction of the wet season in California) provide a cautionary example of where broader-brush seasonal and/or ensemble averages might lead an end user astray. The significant intra-seasonal shifts in change projections underscore the sensitivity to the months chosen, and more broadly remind us that choices such as the models, thresholds, and specific weather event types over which an average change signal is derived can matter greatly (e.g., Prein et al. 2019).

  • Ensemble spread can be wielded beneficially: though certain NA-CORDEX projection metrics possess a relatively large degree of spread, apparent model disagreement need not be interpreted as a lack of skill, particularly in regions where climate change signal-to-noise (e.g., internal variability) might be modest or difficult to discern (e.g., Tebaldi et al. 2011; Deser et al. 2012). For example, large ensembles of simulations with the same model and greenhouse gas forcing indicate wide ranges in the precipitation response over the western U.S., even for 30-year averages (e.g., Deser et al. 2014; NOAA PSL (2020) Climate Change Web Portal). In fact, studies such as Karmalkar (2018) suggest that a lack of sufficient spread is more damaging to end user applications. For example, large intra-model differences in mean monthly precipitation exist over key subregions such as the Pacific Northwest, central Idaho, and northern California if one considers all the individual solutions (e.g., online supplemental Figs. SM7, SM8, SM10). When addressing objectives for which consideration of the full suite of model climate futures is desirable, stakeholders may find value in considering each model member as an internally-physically-consistent, plausible future climate state.

  • The full potential of model datasets such as NA-CORDEX is not realized in a pursuit of identifying a “most skillful” model, and in this study we thus emphasize understanding and harnessing the ensemble spread versus emphasizing model skill relative to historical observational evaluations. The rationale for this is well supported by recent research: (1) large differences in observational data can disproportionately determine what is deemed to be “skill,” particularly in data-sparse regions such as the western U.S. (e.g., Gibson et al. 2019; Gampe et al. 2019); (2) the process of defining skill in RCM projections is a moving target, a function of the metrics and regions chosen, and possesses a strong potential to get the “right answers” for the wrong (physical process-based) reasons (e.g., Mahoney et al. 2013; Bukovsky et al. 2013; Thibeault and Seth 2015; Fan et al. 2015); and (3) the concept of weighting models within an ensemble to produce a superior regional climate “blended” projection has been demonstrated to come with many caveats and potential disadvantages, and in the end contributes yet another source of uncertainty (Christensen et al. 2010; Knutti et al. 2010; Weigel et al. 2010; Bukovsky et al. 2019).

  • End users should plan for sufficient time and expertise to query the physical fidelity of model data and be prepared to investigate the possibility of model output curiosities. One example of an unphysical SWE feedback in RegCM4 has been documented here, demonstrating that extracting data for use in sub-regional planning or secondary application models without first establishing bigger-picture context for the model output could easily lead one away from using the best available data.

  • It is a challenging but important undertaking to optimally combine models, methods, and diagnostics in ways that can produce a representative and relevant story for a specific application. Considering ensemble means alongside extreme member solutions while using ensemble spread in meaningful ways (e.g., Tebaldi et al. 2011) can provide value-added input to secondary application models, enabling so-called “storytelling” frameworks (e.g., Hazeleger et al. 2015; Shepherd 2016) that help define well-founded multiple futures for scenario planning (e.g., Star et al. 2016).
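As one possible starting point for the considerations above, the sketch below reports the ensemble mean alongside the driest and wettest members rather than collapsing the ensemble to a single number. It is illustrative only: the function name `scenario_members`, the member labels, and the change values are hypothetical placeholders, not NA-CORDEX results.

```python
import numpy as np

def scenario_members(changes):
    """Summarize an ensemble by its mean, range, and extreme members.

    changes: dict mapping a member label to its area-averaged projected
    change for one metric (units are whatever the metric uses).
    """
    names = list(changes)
    vals = np.array([changes[n] for n in names], dtype=float)
    return {
        "ensemble_mean": float(vals.mean()),
        "range": float(vals.max() - vals.min()),
        "driest": names[int(vals.argmin())],    # member with the smallest change
        "wettest": names[int(vals.argmax())],   # member with the largest change
    }

# Hypothetical member labels and values, for illustration only.
changes = {"member_a": -12.0, "member_b": 3.0, "member_c": 21.0}
summary = scenario_members(changes)
```

The extreme members identified this way could then be carried forward individually, each as an internally consistent climate future, into a secondary application model.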

There remain many avenues of potential future work to better understand the NA-CORDEX model dataset in particular as well as the outstanding uncertainties in western U.S. precipitation projections. Future efforts to investigate other seasons and sub-regions in greater depth would benefit the research and regional climate modeling communities, as well as stakeholders and end users. As the enduring aphorism goes, “All models are wrong, but some are useful” (Box and Draper 1987). Great challenge and opportunity exist in both model advancement and optimizing the use potential of imperfect and inherently uncertain model guidance.