1 Introduction

Although California has substantially improved its air quality over the past decades with effective emission control strategies, it still remains as one of the most polluted states in the United States (Lurmann et al. 2015; State of the Air Report 2019). It is well known that complex topography and meteorological conditions, such as mesoscale flow patterns and the presence of stagnant meteorological conditions, coupled with high anthropogenic emissions are the main contributors to the poor air quality in California (Niccum et al. 1995; Tanrikulu et al. 2000; Zhong et al. 2004). For instance, many high particulate matter (PM) episodes in California’s San Joaquin Valley (SJV) are largely due to stagnation associated with meteorological drivers such as low planetary boundary layer height (PBLH) and low wind speed, which are sometimes induced by the intrusion of Pacific Subtropical High (PSH) located west of California to SJV (Zhao et al. 2011a). Moreover, pollutants tend to accumulate within the SJV under these conditions, because transport is obstructed and the mountain ranges surrounding the valley help to trap the pollution within the valley.

California has experienced more weather extremes in recent years (Liu et al. 2018), such as the exceptional 2012–2016 drought resulting from extremely low precipitation and record-high temperature (Griffin and Anchukaitis 2014; AghaKouchak et al. 2015; Diffenbaugh et al. 2015), followed by significant flooding in 2017. Some of the weather extremes, such as droughts and heat waves, can exacerbate air pollution episodes (Zhang et al. 2017; Wang et al. 2017) and exert severe impacts on human health (increase of morbidity and mortality and losses of work productivity), wildfires, agriculture production, and ecosystem productivity (Dong et al. 2014, 2015). Heat waves can also cause stress for livestock and wildlife, enhance the spread of pests, as well as increased energy demand for cooling (Beniston et al. 2007). In 2017, the occurrence of two giant wildfires (Thomas and Wine Counties) may be attributed to unhealthy forests (building-up of fuels) and extreme high temperature in California.

It is indisputable that climate change is occurring (Forster et al. 2007; IPCC 2012; Cook et al. 2016) and that weather extremum will continue to occur in the future. It is also probable that weather extremum will become even more pronounced under a changing climate (Meehl et al. 2000; Coumou and Rahmstorf 2012; Swain et al. 2018). Thus, there is a need to better understand how meteorological conditions, especially extreme weather events, may change on a regional scale as the climate changes in the future. In addition, to ensure that regulatory control programs aimed at improving air quality in California will remain effective, it is important to better understand how changes in future meteorological conditions may affect air quality.

There are rather different meteorological and air quality concerns for summertime and wintertime in California. During summertime, when ozone is the primary pollutant of concern, air temperature and ventilation are crucial factors affecting ozone production and buildup (Sillman 2000). Both the frequency and magnitude of heat waves have been shown to increase in California during summer, which are at least partially induced by climate change (Gershunov et al. 2009; Meehl and Tebaldi 2004). In contrast, the delta breeze is an important summertime coastal phenomenon, which may bring cool and moist marine air to the SJV and the Sacramento Valley through the San Francisco Bay delta. The delta breezes can have important impacts on air quality in the Central Valley (CV): first, they efficiently transport air pollutants from the Bay Area into the region; second, they dilute air pollution in the CV due to the stronger winds associated with the breezes (Bao et al. 2008; Niccum et al. 1995); and third, the consequent cooler temperature affects the ozone production over the region.

In wintertime, particular matter (PM) is the primary pollutant of concern due to meteorology that is conducive to the formation of ammonium nitrate particulates as well as directly emitted particulate matter in the form of organic carbon from residential wood burning and under-fire charbroiling (i.e., meat cooking at restaurants; Chen et al. 2009a, 2010). As Turkiewicz et al. (2006) pointed out, high PM concentrations in SJV are episodic: buildups under stagnation conditions and dissipation as weather systems move though the valley breaking up stagnation (either by wash out or ventilation). Given the seasonal variation of both PM and ozone, as well as the differences in the underlying meteorological drivers influencing these two types of pollution, we need to evaluate different meteorological parameters and phenomena in summer [e.g., temperature extremes and marine air penetration (MAP)] versus winter (precipitation characteristics and stagnation events) when we analyze climate change impacts on meteorological conditions relevant to air quality in California.

Global climate model (GCM) simulations have been commonly used to study future changes in climate (e.g., Cox et al. 2000; Meehl et al. 2007). However, because of the complex terrain and intricate flow patterns in California (e.g., land-sea breeze, mountain-valley wind, low-level jets), it is impossible to accurately infer future meteorology over this region directly from GCM results, which normally have rather coarse grid sizes. For example, the horizontal spatial grid size of models used in the Coupled Model Intercomparison Project Phase 5 (CMIP5; Taylor et al. 2012) range from 0.5° to 4° and about half of the models have an average grid size coarser than 1.3°. This coarse grid size, while sufficient for global climate studies, is not sufficient to resolve local/regional scale meteorology in many areas. To overcome the issue of large grid size in regional climate change studies, a regional climate model (RCM) is commonly employed to dynamically downscale GCM results to a finer grid size in order to improve the representation of the regional meteorology (Castro et al. 2005; Knutson et al. 2013; Zhao et al. 2011b).

Previous dynamical downscaling studies have shown that GCM results can have non-negligible biases, which are inevitably passed to the RCM simulations through the initial and lateral boundary conditions (Zhao et al. 2011b; Leung and Ghan 1999; Bell et al. 2004; Duffy et al. 2006). Recent studies have applied bias correction techniques to GCM data prior to dynamical downscaling in order to obtain a more accurate regional climate projection. Two statistical bias correction methods are commonly used: linear bias correction and quantile bias correction. Both methods correct towards reanalysis data, which are assumed to be non-biased. The linear bias correction uses a linear regression equation or a mean bias (i.e., zero slope) to calibrate climatological mean biases of the GCM data (Holland et al. 2010; Bruyère et al. 2014; Meyer and Jin 2017). The quantile bias correction usually requires the development of a statistical function, which is then applied to calibrate the GCM data. For instance, Colette et al. (2012) matched the cumulative distribution function (CDF) from the GCM data onto the distribution of the reanalysis data to correct the GCM bias; and Jin et al. (2011) developed a bilinear regression function based on GCM and reanalysis data and then applied the resulting function to calibrate the GCM climatological bias. White and Toumi (2013) compared the linear bias correction and the quantile bias correction method, and reported that although both methods improved the downscaled monthly climatology, the latter method has a major limitation of introducing spurious variations to the GCM fields that would result in additional variability in the RCM downscaled variables and affect the domain interior. They also pointed out that both methods have problems simulating extrema accurately. In this work, an improved linear bias correction method was used in which both the climatological mean and the inter-annual variations of the GCM are adjusted based on reanalysis data prior to the RCM downscaling. Xu and Yang (2012) reported that this bias correction method results in great improvements in downscaled temperature and precipitation extremes.

Previous dynamical downscaling studies have explored mid-twenty-first-century climate projections over the US based on different Intergovernmental Panel on Climate Change (IPCC) scenarios. For instance, Avise et al. (2009) and Chen et al. (2009b) employed the Fifth-Generation Penn State/NCAR Mesoscale Model (MM5) to dynamically downscale Parallel Climate Model (PCM) data to a 36 × 36 km2 grid over the continental U.S. Their results suggest an approximately 1 K increase in summertime daily maximum temperature for most of California by the 2050 s, as well as an up to 400 m decrease in summer mean daily maximum PBLH (\(PBLH_{max}\)) for most of the Northern and Central California and an up to 600 m decrease in \(PBLH_{max}\) over the arid region of Southeastern California. Similarly, Leung and Gustafson (2005) utilized MM5 to downscale GCM simulations (from the Goddard Institute of Space Studies model (GISS)), but projected increases in summertime surface temperature over California that ranged from 1 to 3 K by the 2050s. Pierce et al. (2012) used 3 RCMs to dynamical downscale 16 GCM model results from CMIP3 Climate Projections to ~ 12 × 12 km2 grid and explored the potential changes in temperature and precipitation over California by the 2060s. The downscaled results presented in Pierce et al. (2012) showed smaller temperature increases (ranging from 1.5 to 2.5 K) for the coastal area over Western California and a greater rise in temperature (ranging from 2 to 3.2 K) over the mountainous and/or arid regions of Eastern California.

The primary goal of this study is to explore the future projection of air pollution-relevant meteorological variables (e.g., temperature, wind, precipitation and ventilation) and important atmospheric phenomena (e.g., heat wave events, MAP events and droughts) over California in greater detail using the aforementioned dynamical downscaling with bias correction (i.e., the mean bias correction plus the adjustment of inter-annual variations). We use 2003–2012 for the current climate period and 2046–2055 for the future climate period. For each decadal period, output from the Community Earth System Model (CESM) CMIP 5 runs at ~ 1°×1° grid was dynamically downscaled to a 4 × 4 km2 grid over California using the WRF regional model. California is geographically diverse with a long coastline to the west, a long and narrow valley in the middle of the state surrounded by mountain ranges such as the Sierra Nevada and Coast Ranges, and the arid Mojave Desert in the southeast of the state. The influence of climate change can vary significantly for different geographical regions over California. Thus, in this study, California is divided into several geographical sub-regions and the climate projections over these sub-regions are discussed separately using the 4-km dynamical downscaled results. As mentioned earlier, the pollution-relevant meteorological conditions in California are rather different between summer and winter. In addition to evaluating the dynamical downscaling method with bias correction applied to the CESM data, this paper also presents the climate change impact on summertime meteorological conditions and atmospheric phenomena. The analysis of the meteorological conditions and atmospheric events in wintertime will be discussed in a separate paper. Note that due to the large computational cost associated with the WRF downscaling simulations, WRF downscaling experiments are only conducted driven by CESM results from CMIP5 under one emission scenario in this study. The results from this study create a potential future scenario in a high resolution.

This paper is organized as follows. Section 2 describes the GCM and RCM used in this study, the dynamical downscaling methodology and the bias correction technique, as well as model configuration. Section 3 evaluates dynamical downscaling results with the bias correction method. Section 4 presents the summertime climate change impacts on different sub-regions over California and the impact on heat wave and MAP events. Concluding remarks are given in Sect. 5.

2 Models and methodology

2.1 Model description

CESM is a fully-coupled global climate model developed by the National Center for Atmospheric Research (NCAR) and other institutes. The Community Atmosphere Model Version 4 (CAM4; Neale 2011), the Community Land Model Version 4 (CLM4; Lawrence et al. 2011), the Parallel Ocean Program Version 2 (POP2; Smith 2010), and the Community Sea-Ice Model Version 4 (CSIM4; Hunke and Lipscomb 2008) are the atmospheric, land, oceanic and sea-ice components of CESM, respectively. For CMIP5, CAM4 is configured with a nominal horizontal grid of 0.9° × 1.25° and 26 vertical levels. Four representative concentration pathways (RCPs) scenarios have been applied to future CMIP5 climate simulations, including a “low emissions” (RCP2.6) scenario, “medium–low emissions” (RCP4.5) scenario, “medium–high emissions” (RCP6.0) scenario, and “high emissions” (RCP8.5) scenario (Moss et al. 2010). The CESM CMIP5 results are from an ensemble of climate simulations, which include six ensemble members of the twentieth-century climate (1850–2005) and six ensemble members for each of the four RCPs for the future projections, where the ensemble is used to reduce model uncertainties (Peacock 2012). The CESM simulation results based on the RCP6.0 scenario were used in this study. This emission scenario reflects local climate solutions similar to those applied in California towards economic, social, and environmental sustainability with intermediate population increase and intermediate economic development. The CESM data used in this study is downloaded from the Research Data Archive (RDA) at NCAR (data set 316.0; Research Data Archive 2011), which is from ensemble member #6 of the CESM CMIP5 runs, also known as the Mother of All Runs [MOAR; b40.rcp6_0.1 deg.006 (RCP6.0)]. This is the only CESM ensemble member under RCP6.0 that has the full fields needed to drive WRF model at 6-hourly intervals (Bruyère et al. 2015). The climate model simulations used in this study are described in Meehl et al. (2012) and Peacock (2012) in great detail.

The RCM used in this study is the WRF model with the Advanced Research WRF (ARW) dynamic core version 3.5.1 (Skamarock et al. 2008). WRF-ARW is a non-hydrostatic model suitable for a broad range of applications across different scales. The WRF model was chosen in the study since it has been widely used to simulate different weather phenomena over California (Jankov et al. 2009; Leung and Qian 2009; Lu et al. 2012) and to dynamically downscale GCM results for different applications (Caldwell et al. 2009; Zhao et al. 2011a, b).

2.2 Methodology for dynamical downscaling with bias corrections

NARR reanalysis (Mesinger et al. 2006) for the 2003–2012 period from the National Centers for Environmental Prediction (NCEP) was used as a benchmark to bias correct the CESM results. NARR data, which is available at 3-h intervals, covers North America with a 32 × 32 km2 horizontal grid and 29 vertical levels. Note that in addition to different horizontal grid sizes, CESM and NARR data use different vertical coordinates, in which the CESM data are on the hybrid sigma-pressure levels, while NARR data are on pressure levels. Thus, 3D variables from NARR are first horizontally averaged to CESM grids, and then vertically interpolated to CESM levels during the bias correction process.

The bias correction method developed by Xu and Yang (2012), which not only corrects the climatological means but also the interannual variations of the GCM data, was utilized in this study to correct the biases in the CESM output. The bias correction is tuned for each calendar month and applied on a 6-hourly basis. The interannual variations, which represent the variability from year-to-year for a given month, are adjusted by scaling the variations of the CESM data to those of the NARR data for the same month. In other words, we multiply the variations of the 6-hourly CESM data by the ratio of the standard deviation of NARR data to that of the CESM data for 2003–2012 by month. The same ratio is applied to adjust future CESM variations. The procedures for the bias correction and variation adjustment of 6-hourly CESM data for the present (\(CESM_{P,m}^{*}\)) and future (\(CESM_{F,m}^{*}\)) climate data are shown in Eqs. (1) and (2), respectively.

$$CESM_{P,m}^{*} = \overline{{NARR_{P,m} }} + CESM_{P,m}^{'} \times \frac{{ \left. {\sigma_{NARR} } \right|_{P,m} }}{{\left. {\sigma_{CESM} } \right|_{P,m} }}$$
$$CESM_{F,m}^{*} =\, \overline{{NARR_{P,m} }} + \left( {\overline{{CESM_{F,m} }} - \overline{{CESM_{P,m} }} } \right) + CESM_{F,m}^{'} \times \frac{{\left. {\sigma_{NARR} } \right|_{P,m} }}{{\left. {\sigma_{CESM} } \right|_{P,m} }}$$

The subscripts P and F in (1) and (2) indicate the present and future climate period, respectively, while the subscript m refers to the calendar month. \(\overline{NARR}\) and \(\overline{CESM}\) represent the 10-year climatological means of the 6-hourly NARR and CESM data for a given grid cell and calendar month. \(CESM^{'}\) is the anomaly of CESM data from its 10-year monthly climatological mean for that hour. Lastly, \(\left. {\sigma_{NARR} } \right|_{P,m}\) and \(\left. {\sigma_{CESM} } \right|_{P,m}\) are the monthly standard deviations of the NARR and CESM climate data, respectively, for the present 10-year period. This bias correction method has been shown to be superior to the traditional linear bias correction method in retaining the interannual variability of the GCM data, capturing downscaled weather extreme events and improving probability distributions of meteorological variables (Xu and Yang 2012, 2015). More detailed description of the method can be found in Xu and Yang (2012). Following Xu and Yang (2012), bias correction was applied to three-dimensional (3D) air temperature, zonal and meridional wind, water vapor mixing ratio, and geopotential height at each model grid and on each vertical level for each 6-hourly CESM output. Different from Xu and Yang (2012), bias correction was also applied to air temperature and relative humidity at two meters above ground level (AGL), as well as skin temperature, which are crucial parameters used as initial inputs for WRF through the dynamical downscaling process, in this study.

Figure 1 shows the probability distributions of 2-m air temperature (T2) and relative humidity (RH2) for the grid points encompassing Los Angeles and Fresno from the original CESM data, NARR data horizontally averaged to CESM grids, bias-corrected CESM data with only the climatological mean corrected (referred to as “bc_m”), and bias-corrected CESM data with both the climatological mean and interannual variation corrected (referred to as “bc_mv”) for the summer months (June, July and August) of 2003–2012 at 00 UTC [4 pm Pacific Standard Time (PST)], respectively. It is illustrated that the original CESM data has a rather large warm and dry (cold and wet) bias for the grid encompassing Los Angeles (Fresno). Although the probability distribution of T2 and RH2 from the bias-corrected CESM match these from NARR data better than those from the original CESM data, regardless of which bias correction method is used, it is obvious that the “bc_mv” bias correction method is superior to the “bc_m” method at reproducing the probability distribution of these variables and over these two regions. For instance, Fig. 1a illustrates that the mode values (peak of the probability distribution) for T2 in Los Angeles from the original CESM data and the averaged NARR data are 307 K and 302 K, while the corresponding probabilities at the mode values are 11.5% and 15.8%, respectively. Both the mode value (303 K) and its corresponding probability (14.5%) of the T2 probability distribution from bias-corrected CESM data using the “bc_mv” method are comparable to the counterparts from averaged NARR data for the data point encompassing Los Angles; whereas, the T2 probability distribution from bias-corrected CESM data with the “bc_m” method has two modes centered at 300 K and 303 K with much smaller probability (11.5%) than the counterpart from the averaged NARR data. Similar improvements were observed for other parameters and at other locations shown in Fig. 1. Another important characteristic illustrated in Fig. 1 is that the two tails of the probability distribution in each panel from the bias-corrected CEMS data with the “bc_mv” method resemble those from averaged NARR data much better than those from the CESM data with the “bc_m” bias correction method, implying that the “bc_mv” method is a better choice than the “bc_m” method for capturing weather extremes (drought, extreme hot events, etc.). Note that only data from 00 UTC are used in Fig. 1, so variability from the diurnal cycle is not contained in these plots. The spatial distributions of July mean surface and 850 hPa temperature averaged for 2003–2012 from the NARR reanalysis and the original CESM (Fig. S1) illustrate that the original CESM data has a warm bias in both T2 and 850 hPa temperature for most areas over the land and ocean. All data in Fig. S1 is gridded on to a 36 × 36 km2 horizontal grid using the WRF Preprocessing System (WPS) for easier comparison. In California, T2 is clearly over-predicted over the Sierra Nevada mountain range and over Southern California. The hot area over southern California and southwestern Arizona is much larger in the original CESM data than that in the NARR data. Under-prediction of T2 in the original CESM data is observed for the southern portion of the CV, which is consistent with what is shown in Fig. 1b. After bias correction with the “bc_mv” method, all of the biases noted above are substantially reduced. Over California, cooler surface temperature over the Sierra Nevada mountain range is evident after the bias correction of the CESM data, and the warm area over southern California has an amplitude and pattern comparable to that of the NARR data after bias correction. It is clear from Fig. S1 that the overall patterns of T2 and 850 hPa temperature from the bias-corrected CESM data resemble those from the NARR reanalysis reasonably well.

Fig. 1
figure 1

Probability distribution of T2 (in K; top) and RH2 (in %; bottom) for grids encompassing Los Angeles (left) and Fresno (right) from original CESM data (gray), NARR data averaged to CESM grid (black), bias-corrected CESM data with both the climatological mean and interannual variation corrected (blue), and CESM data with only climatological mean corrected (red) for summer months of 2003–2012 at 00 UTC

The GCM bias can be time-dependent. Thus, it is possible that the bias correction coefficients derived from the training period (2003–2012) may not work properly for another period (Reifen and Toumi 2009). To examine whether the coefficients derived from the training period using the “bc_mv” method can be applied to adjust the CESM bias for other periods, these coefficients are applied to CESM data for two validation periods (1980–1989 and 1990–1999) based on Eq. (2) and the result is evaluated here. These two validation periods are chosen because NARR data starts from 1979. Figures S2 and S3 show monthly mean T2 at 00 UTC (4 pm PST) from the original CESM, bias-corrected CESM and NARR data for July averaged over the validation periods. All data are plotted on the original grids of the NARR and CESM data. Compared to the NARR data, the original CESM data have a broader coverage of warm temperature over southern California, southwestern Arizona and the northern border of Mexico (Figs. S2 and S3). Moreover, the temperature gradient between the warmer CV and the cooler Sierra Nevada is smaller in the original CESM data (Figs. S2a and S3a) compared to those from the NARR data (Figs. S2c and S3c). These discrepancies between the original CESM and NARR data are reduced greatly after performing bias correction on the CESM data (Figs. S2b and S3b). Furthermore, the overall over-prediction of T2 over the ocean from the original CESM data is also reduced substantially. Reductions in discrepancies between bias-corrected CESM and NARR data are observed for other variables and other calendar months over these two validation periods (figures not shown here). The improvement in the agreements between the CESM and NARR data for the two validation periods after applying the bias correction coefficients, which were derived from the training data over a different time period (2003–2012), illustrates the time-invariance of the biases of the CESM data relative to NARR data, and which indicates that the coefficients are stable and can be applied to other time periods. Therefore, these bias correction coefficients are applied to adjust the biases of the CESM data for the future simulation period (2046–2055) in this study.

2.3 Model configurations

The WRF model is configured with three nested domains with horizontal grid sizes of 36 × 36, 12 × 12, and 4 × 4 km2, and 31 vertical layers (Fig. 2a). Previous studies (Liu et al. 2012; Xu and Yang 2015) reported that spectral nudging (Waldron et al. 1996) is more suitable than grid nudging in restraining dynamically downscaled WRF simulations, due to the fact that spectral nudging reduces the drift of WRF simulations from the large-scale pattern of GCMs and at the same time retains the small-scale features generated by WRF. In this study, spectral nudging is applied to temperature, horizontal winds, and geopotential height above the planetary boundary layer (PBL) for the outermost domain, which is about 3300 km in both zonal and meridional directions. The maximum wave number 3 was chosen for spectral nudging in zonal and meridional directions, which captures a wavelength of 1100 km and longer in the driving CESM data. The nudging coefficients were set to 3 × 10−4 s−1 as suggested by Xu and Yang (2015). Because of the use of spectral nudging and relatively weak nudging coefficients, monthly WRF simulations were divided into two roughly 2-week simulations, where the first simulation was initialized on the second to last day of the prior month and continued through the 15th of the current month. The second simulation was initialized on the 14th of each month and continued through the end of the month. The first 2 days of each simulation were used as model spin-up. For instance, the 1st simulation for July 2003 started at 00 UTC June 29 and ended at 00 UTC July 16; while the 2nd simulation for that month began at 00 UTC July 14 and ended at 00 UTC August 1. The physics schemes and domain configuration utilized in this study follow those that have been used for regulatory WRF modeling activities at the California Air Resources Board, which were chosen based on the combination of parameterization schemes that resulted in superior model performance over California (results not shown here). Major physical schemes include WRF Single-Moment 6-class microphysics (Hong and Lim 2006), RRTM longwave radiation (Mlawer et al. 1997), Dudhia short wave radiation (Dudhia 1989), Pleim–Xiu land-surface parameterization (Pleim and Xiu 1995), Yonsei University PBL parameterization (Hong et al. 2006), and Kain–Fritsch convective parameterization (Kain 2004) for the outer 2 domains. The initial conditions for the three domains and lateral boundary conditions for the outermost domain were provided by the bias-corrected 6-hourly CESM data through the dynamical downscaling process. Two-way nesting was applied between the parent and the nested domains.

Fig. 2
figure 2

a WRF 3 nesting domains; b geophysical sub-regions with detailed analysis; The abbreviation of BA, GBV, MC, MD, NC, SAC, SC, SCC, and SJV stand for Bay Area, Great Basin Valley, Mountain Counties, Mojave Desert, North Coast, Sacramento Valley, South Coast, South Central Coast, and San Joaquin Valley, respectively

In addition to WRF downscaling simulations driven by bias-corrected CESM data, WRF simulations were also conducted driven by NARR data and the original CESM data for the present 10-year period. These simulations were used to evaluate the influence of bias correction on the dynamically downscaled results. WRF simulations driven by the original CESM data were only conducted on the outer 2 domains to save computational time and storage.

3 Evaluation of the dynamical downscaling results with the bias correction method

As demonstrated in Sect. 2, the bias-corrected CESM data is more comparable to NARR reanalysis data relative to the original CESM data. The bias corrected CESM data will likely provide more reliable initial and boundary conditions for the subsequent WRF downscaling simulations, which will be assessed in this section. Our assessment is based on the comparison of the two sets of WRF downscaled results driven by the original CESM data (D_OCESM) and by the bias-corrected CESM data (D_BCCESM) against those driven by NARR reanalysis fields (D_NARR) for the summer of the current 10-year simulation period. Surface observations are also utilized to further validate the WRF downscaled results. As mentioned in Sect. 2.3, D_OCESM uses only the outer 2 domains shown in Fig. 2a, therefore, the comparisons made in this section are based on WRF results from the 2nd domain with 12 × 12 km2 horizontal resolution.

California has highly complex terrain, and climate change projections can vary remarkably in different geographical regions. Our fine-grid dynamically downscaled results are proper for exploring climate change projections for sub-regions with complex terrain. Therefore, in addition to a statewide analysis, the evaluation of dynamically downscaled results for the present climate (this section), as well as the investigation of the future climate projections (subsequent sections), will also focus on the nine geographical sub-regions shown in Fig. 2b (see figure for the names of sub-regions and their acronyms). The nine sub-regions represent different climate and terrain conditions in California. NC, BA, SCC and SC represent different coastal oceanic climate regions and among them, BA and SC include two major urban centers (i.e., San Francisco for BA and Los Angeles for SC). SJV and SAC are California’s two largest agricultural regions and the climate of these two regions is largely affected by MAP events and mountain-valley wind flow patterns. GBV and MD (MC) are selected in order to assess climate change in California’s deserts (mountains).

The summer mean surface temperature in D_OCESM is warmer than that in D_NARR (up to 3 K) over the coast of California, Sierra Nevada mountain range, and Pacific Ocean off the coast of California (Fig. 3a), which are reduced substantially after applying bias correction to the CESM data (Fig. 3b). The T2 over-prediction for other states in the Western US in D_OCESM (Fig. 3a) is also reduced remarkably in D_BCCESM (Fig. 3b). Figure 4a illustrates that except for three sub-regions (MC, SAC and SJV), where the regional mean differences in T2 (relative to D_NARR) remain almost unchanged, the regional mean differences for the other six sub-regions from D_BCCESM are smaller than D_OCESM, especially for coastal regions (SC, SCC and NC) and GBV. The substantially smaller variations of T2 differences within each region for D_BCCESM compared to D_OCESM for 7 out of 9 sub-regions (Fig. 4a), together with overall smaller regional mean differences, indicate an improved resemblance of the simulated spatial patterns of T2 between D_BCCESM and D_NARR than for D_OCESM.

Fig. 3
figure 3

Differences of summer mean T2 (in K; top), RH2 (in %; middle), and 10-m wind speed (in m/s; bottom) from D_OCESM (left) and D_BCCESM (right) relative to D_NARR

Fig. 4
figure 4

Box–whisker plots for statistics of difference in a T2 (in K), b RH2 (in %), and c 10-m wind speed (ws10; in m/s) from D_OCESM (red) and D_BCCESM (blue) relative to D_NARR in 9 sub-regions. The box represents the values between the first and third quartile. The whiskers represent the distances between the lower extreme (5%) to the first quartile and the fourth quartile to the upper extreme (95%). The lines in the middle of the boxes represents the regional mean values

D_OCESM simulated RH2 is higher than D_NARR’s (up to 10%) for almost the entire California and Nevada, but is remarkably lower than D_NARR’s (up to − 10%) for most of Arizona (Fig. 3c). The RH2 discrepancies between D_OCESM and D_NARR over California and Nevada are reduced considerably after the bias correction of the CESM data (Fig. 3d). The large under-prediction of RH2 over Arizona in D_OCESM (Fig. 3c) becomes a small over-prediction in D_BCCESM (Fig. 3d). However, compared to D_NARR, the under-prediction of RH2 over western Oregon and Washington in D_OCESN is more pronounced, and it is extended further south to the northern coast of California in D_BCCESM. Table S1 and Fig. 4b show that compared to D_NARR, D_OCESM over-predicts RH2 for all 9 sub-regions with the largest regional mean difference of + 5.27% over BA. Driven by bias-corrected CESM data, D_BCCESM over (under) predicts RH2 for 5 (4) out of 9 sub-regions; and the magnitude of the difference is reduced more than 50% for SAC, BA, MC, SJV and SC. Moreover, a reduction in regional variations of differences in RH2 from D_BCCESM compared to D_OCESM is observed in 7 out of 9 regions (Fig. 4b), suggesting better agreement in the simulated spatial pattern of RH2 between D_BCCEMS and D_NARR than that between D_OCESM and D_NARR. Surface temperature and moisture from WRF simulations are greatly influenced by soil variables. Differences in T2 and RH2 between D_BCCESM and D_NARR remain relatively large after the bias correction (Figs. 3, 4). This is in part due to bias correction not being applied to the soil temperature and soil moisture fields, because the difference in the temporal resolution of the soil variables from CESM (monthly) and NARR (6-hourly) would likely introduce additional uncertainties, which are difficult to quantify, if bias correction was applied.

Figure 3e, f depict an overall improvement in surface wind prediction over California in D_BCCESM compared to D_OCESM; however, the positive ws10 difference between D_OCESM and D_NARR off the northern coast of California and Oregon is amplified after bias correction. The summertime regional mean difference in ws10 (Fig. 4c and Table S1) is improved for sub-regions with positive differences between D_OCESM and D_NARR; whereas, the magnitudes of the negative differences between D_OCESM and D_NARR become slightly bigger after the bias correction of CESM data. These changes in area-mean ws10 indicate that D_BCCESM tends to simulate slightly weaker winds over California compared to D_OCESM. Overall, Fig. 4c and Table S1 show that as a result of the bias correction of the CESM data, the regional mean ws10 from the downscaled WRF simulations is improved for 6 out of 9 sub-regions, and the regional variations of differences in ws10 are smaller from D_BCCESM compared to D_OCESM for all 9 sub-regions.

Comparisons are also conducted between the WRF downscaled results (D_OCESM and D_BCCESM) and surface observations from the Federal Aviation Administration (FAA) and the US EPA Aerometric Information Retrieval System (AIRS) for SJV and SC to further evaluate model performance with and without CESM bias correction (Fig. S4). The SJV and SC were chosen because they are the two regions in California with the worst air pollution, so they have been heavily studied and will remain a focus of future studies. The numbers of stations used in this comparison are 32 (28) stations for T2 over SJV (SC); 12 (11) stations for RH2 over SJV (SC); and 32 (38) stations for ws10 over SJV (SC), respectively Figure S4 shows that the T2 mean biases are reduced substantially from 1.7 K in D_OCESM to − 0.1 K in D_BCCESM over SC and from − 0.6 K in D_OCESM to − 0.5 K in D_BCCESM over SJV. The RH2 positive (negative) mean bias over SJV (SC) from D_OCESM is also reduced when the CESM data is bias corrected. For ws10, the mean bias decreases from 0.43 m/s in D_OCESM to 0.15 m/s in D_BCCESM over SC; however, the mean ws10 bias in D_OCESM increases from − 0.08 m/s to − 0.26 m/s in D_BCCESM over SJV. Figures 4 and S4 illustrate that the trends in model biases from D_OCESM and D_BCCESM are consistent compared to both D_NARR results and observations over these two sub-regions, implying that it is reasonable to use D_NARR as a benchmark to evaluate the performance of D_OCESM and D_BCCESM. The comparisons to both D_NARR and surface observations suggest overall improvements in WRF downscaled surface results over California as a result of the bias correction of the CESM data.

Figure 5 shows the difference in vertical profile of summertime mean temperature, geopotential height, wind speed, and water vapor mixing ratio (Qv) from both D_OCESM and D_BCCESM relative to D_NARR. Note that the mean values are calculated using WRF domain 2 (12 × 12 km2 grid) results but averaged over the region covered by domain 3 (Fig. 2a). Compared to D_NARR, D_OCESM results exhibit a positive difference up to 2 K below ~ 500 hPa and a negative difference up to − 3 K above ~ 500 hPa. After applying bias correction to the CESM data (i.e., D_BCCESM), the positive difference below 500 hPa was reduced by more than 50%, especially for the lowest 2 layers close to the surface where the differences were reduced from 1.72 K and 1.98 K to 0.28 K and 0.82 K, respectively. The large negative biases above 500 hPa were reduced remarkably and the magnitudes of the differences were smaller than 0.4 K. The over-predictions of Qv from D_OCESM were reduced for almost all layers after using bias-corrected CESM data (i.e., D_BCCESM), especially for the lowest two layers. Compared to D_OCESM, the differences in wind speeds below 800 hPa become slightly larger from D_BCCESM, which is likely due to the increase in the positive difference in the wind speed off the northern coast of California from D_BCCESM (Fig. 3e, f). However, the under-prediction of wind speeds for layers above 800 hPa from D_OCESM were reduced appreciably. Smaller differences were also obtained for geopotential height after using the bias-corrected CESM data (i.e., D_BCCESM). Figure S5 illustrates that the positive (negative) differences in 500 hPa temperature over the northern (southern) part of the domain in D_OCESM were greatly diminished in D_BCCESM. The over-predictions of 500 hPa geopotential height over the entire domain in D_OCESM were also greatly reduced in D_BCCESM, especially for the northern part of the domain.

Fig. 5
figure 5

Summer mean difference of a air temperature, b geopotential height, c wind speed, and d water vapor mixing ratio from D_BCCESM and D_OCESM comparing to those from D_NARR. Note the mean differences are calculated based on results from domain 2 but over the same area of domain 3 shown in Fig. 2a

The evaluation in this section demonstrates that the disagreements between D_OCESM and D_NARR for both the near-surface and upper-air variables were improved considerably when the biases of the original CESM data were corrected (i.e., D_BCCESM) prior to the dynamical downscaling process.

4 Summertime climate change projections over California

In this section, the impact of climate change on California’s summertime meteorological variables that are crucial to summer ozone formation and buildup, such as temperature, water vapor mixing ratio, wind speed, and PBLH, is investigated using the WRF downscaled results. Model outputs from the innermost domain (4 × 4 km2 grid) are used for this evaluation. The climate change influence on heat waves, which have a significant impact on morbidity and mortality in humans and livestock, as well as energy resources in California is investigated in this section. Furthermore, the climate change impact on MAP events which bring cool and moist marine air to the CV of California is also discussed.

4.1 Summertime overall future climate changes over California

The differences between the future and present summer means of T2, 2-m water vapor mixing ratio (Qv2), \(PBLH_{max}\) and ws10 are illustrated in Fig. 6. A p value analysis is conducted to evaluate the significance of the future change in these variables compared to their inter-annual variations (Zhao et al. 2011a). The projected future change in the variable of interest is considered statistically significant when the p value is smaller than 0.05 (95% confidence) in this study. Areas with statistically significant future changes are shown in Fig. 6. A future increase in summertime daily mean T2 is projected throughout the entire domain (Fig. 6a). Over California, GBV and MD are the two regions that have a greater than 1 K increase in regional mean T2, while SC, SAC, MC, SJV and SCC exceed a 0.9 K increase in regional mean T2 (Table 1). An approximate 1 K T2 increase is also predicted over the Pacific Ocean within the analysis domain. These temperature increases are statistically significant at the 95% confidence level relative to the inter-annual variation over the entire domain except for the northern coast of California (Fig. 6a). The predicted pattern and magnitude of the temperature increase in this study are consistent with those shown in previous studies (Liang et al. 2006; Avise et al. 2009; Chen et al. 2009b; Trail et al. 2013). Prior studies, such as Easterling et al. (1997), Karl et al. (1993), and Vose et al. (2005) reported a faster increase of nighttime temperature than daytime temperature, resulting in a smaller daily temperature range (DTR). Our results also show a greater nighttime T2 increase than daytime increase over most inland areas of the domain (Fig. S6). More than 0.2 K reductions in the regional mean DTRs were observed in several sub-regions, including MC, SAC, NC, SJV and GBV (Table S2). As pointed out in previous studies (Easterling et al. 1997; Karl et al. 1993), the future decrease in DTR may be a consequence of a combination of several factors, such as changes in cloud cover and shortwave radiation. The investigation of the underlining physics mechanisms of the observed DTR decrease is beyond the scope of this study.

Fig. 6
figure 6

Future changes in a daily mean T2 (K); b daily mean Qv2 (g/kg); c PBLH (m) at 2 pm PST; and d daily mean10-m wind speed (m/s) averaged over the summer months. Regions with 95% confidence level (p < 0.05) of the future changes in these variables are shaded with cross-hatching

Table 1 Future change of summertime daily mean 2-m T and 2-m Qv; daily maximum PBLH; daily mean 10-m wind speed averaged for the nine geophysical sub-regions

In addition to climate change impacts to the seasonal mean T2, changes throughout the entire distribution of T2 are also studied here. For example, Fig. 7 depicts the probability distribution for statewide-mean daily maximum (Tmax), daily minimum (Tmin) and daily mean (Tmean) T2 for the present and future decades. Figure 7 shows that compared to the present decade, the means of the Tmax, Tmin, and Tmean are shifted to warmer temperatures by 0.8 K, 1.1 K and 1.0 K in the future decade, respectively. Consistent with IPCC (2012) and Donat and Alexander (2012), a tendency towards less negative skewness in the future climate is found for all three parameters (Fig. 7), suggesting changes in asymmetry toward the hotter part of the distribution in addition to the shifts in the means of the three variables. The 5th and 95th percentiles of Tmax (Tmin) are projected to increase by 0.7 K and 1.1 K (1.9 K and 1.1 K), respectively. For Tmin, the shift in the 95th percentile is roughly equal to the shift in the mean, but for Tmax, the shift in 95th percentile is approximately 30% greater than the shift in the mean. In contrast, the shift in the 5th percentile of Tmin is approximately 75% greater than the shift in the mean, while for Tmax, the shift in the 5th percentile is roughly 12% lower than the shift in the mean. The changes in the variance are more heterogeneous. A reduction of variance in the future decade is observed in Tmin and Tmean; whereas, an increase is observed in Tmax; implying a narrower (wider) probability distribution for Tmin and Tmean (Tmax) in the future decade. Greater future changes in Tmin compared to Tmax are seen in most statistical parameters shown in Fig. 7, indicating a greater climate change impact on nighttime temperature than daytime temperature. The increase in positive skewness seen in Fig. 7 is indicative of more frequent hot extreme events in the future.

Fig. 7
figure 7

Probability distribution for present (2003–2012; light blue) and future period (2046–2055; light red) of state-mean a daily maximum, b daily minimum, and c daily mean T2 with 100 bins. Statistics (mean, variance, skewness, 5th and 95th percentiles) for each dataset are shown in each panel

Historical observations and numerical studies reported that lower-tropospheric water vapor will increase as the climate warms (Ingram 2002; Santer et al. 2007; Held and Soden 2006). The downscaled results from this study predict future increase in Qv2 over California and Pacific Ocean (Fig. 6b). The increase is overall greater over the ocean than inland. The greatest increase of Qv2 inland is predicted along the coast (BA, NC, SCC and SC); and a 0.33 g/kg future increase in regional mean Qv2 is also projected over SAC (Fig. 6b and Table 1). In addition to the Pacific Ocean, over which the Qv2 change is significant, the increases in Qv2 are also statistically significant over the coastal regions of California, SAC and part of SJV (Fig. 6b).

Surface temperature is a crucial factor for PBLH, since higher surface temperature normally leads to stronger turbulence and vertical mixing, and subsequently resulting in higher PBLH. Therefore, it is not surprising to see projected future increases in \(PBLH_{max}\) over most regions within the domain (Fig. 6c). Here it is assumed that \(PBLH_{max}\) occurs at 2 pm PST over California during summer (Bianco et al. 2011). Figure 6c indicates an over 50 m increase in \(PBLH_{max}\) in some areas of Nevada, which is up to a 10% increase based on Fig. S7. The percentage change of \(PBLH_{max}\) is generally within the range of − 5% to 5% over California (Fig. S7), with the biggest regional-mean \(PBLH_{max}\) increase of 43.44 m over MD (Fig. 6c and Table 1). Other regions with relatively large \(PBLH_{max}\) increase are GBV, SAC and SJV, where the projected regional-mean daily maximum PBLH increases are 23.14 m, 10.77 m, and 10.44 m, respectively. It is interesting that \(PBLH_{max}\) decreases over mountain ranges surrounding the CV of California, especially over the California Coastal Ranges and some parts of Sierra Nevada Mountains (Fig. 6c). Previous studies (Seidel et al. 2012; Guo et al. 2016) report that daytime PBLH is negatively correlated with surface pressure and lower tropospheric stability (LTS), which is defined as the difference in potential temperature between 700 hPa and the surface, and positively correlated with surface temperature, surface wind speed, and 500 hPa geopotential height. Figure 8 illustrates an evident future increase in summertime surface pressure and LTS over the mountain ranges surrounding the CV in California. The ws10 is projected to decrease over some parts of the mountain ranges (Fig. 8c), and the projected temperature rise is less distinct over mountain areas than other regions in California, particularly over the California Coast Ranges (Fig. 8d). All these future changes illustrated in Fig. 8 may contribute to the negative future changes in \(PBLH_{max}\) over the mountain regions shown in Fig. 6c. As a result of the high nonlinearity of the atmosphere and lack of a direct relationship between temperature increase and PBLH change, the future changes in \(PBLH_{max}\) are only statistically significant over MD (with positive changes) and Coastal Ranges (with negative changes) over California (Fig. 6c).

Fig. 8
figure 8

Future changes in a surface pressure (Ps; in Pa); b lower tropospheric stability (LTS; in K); c 10-m wind speed (in m/s); and d T2 (in K) at 2 pm PST averaged for the summer months. The thick black contours in a, b represent regions with a 95% confidence level (p < 0.05) of the future changes in Ps and LTS, respectively; whereas, the areas outside the dashed contours in d represent regions with a 95% confidence level (p < 0.05) of the future changes in T2

The projected future change in ws10 is relatively small. Minor increases in ws10 are observed over the coastal regions and off the coast of California, while decreases in ws10 are seen over the Sierra Nevada Mountains and part of GBV (Fig. 6d and Table 1). Due to the highly nonlinear relationship between changes in temperature and wind, the only broad region with statistically significant future changes in ws10 at 95% confidence level is over the Sierra Nevada Mountains (Fig. 6d). Variations in meteorological conditions play a critical role in determining summer ozone formation in California. The projected warmer future conditions are conducive to ozone formation in California; whereas, the projected overall increase in PBLH and surface wind will likely dilute ozone concentrations and affect the spatial distribution of summer ozone. Therefore, it is impossible to infer how ozone concentrations may change under projected future meteorological conditions alone. The next phase of this study will be to conduct photochemical model simulations driven by these WRF fields to investigate the influence of climate change on summer ozone concentrations over California.

4.2 Impacts of climate change on hot extremes and heat wave events

It is well known that heat extremes and heat wave events have adverse impacts on agriculture, water and energy resources, air quality, as well as human health, especially for children and the elderly. The IPCC projects that extreme heat events will intensify in magnitude, frequency and duration across the US for the twenty-first century. California has experienced several severe heat wave events in the past decade. The infamous 2006 heat wave, which affected much of the state with record-breaking temperatures and duration of the heat extremes, had substantial impact on morbidity and mortality throughout California (Knowlton et al. 2009; Ostro et al. 2009). Two severe heat wave events occurred in 2017 over California, breaking high temperature records in many locations: the first event lasted from mid to end of June and the second one started in late August and lasted through early September. Given the impact that these types of events can have on human health and the environment, developing a better understanding of the climate change impacts on hot extremes and heat wave events is essential for California, and is discussed in this section.

Figure 9a shows the 90th percentile of summertime Tmax (Tc_max_90) calculated from WRF downscaled results for the present decade. Clearly, the highest Tc_max_90 (~ 40 °C) over California occurred in MD and the northern part of SAC. Under the future climate (Figs. 9b and S8), MD and GBV are projected to experience the greatest increase in number of hot days that exceed Tc_max_90 (Figs. 9b and S9). Note that the number of days exceeding Tc_max_90 is a constant of 92 days for the present decade because there is a total number of 920 summer days (June to August) within a decade. The number of days exceeding Tc_max_90 is doubled in the future decade over MD and GBV, where the regional mean of ~ 120 more days are expected to surpass Tc_max_90 (Figs. 9b and S9). Except for BA, all of the other 8 sub-regions are expected to have an increase of over 50% (i.e. 46 days) in the number of days exceeding Tc_max_90 in the future decade. The projected future increase in number of days exceeding Tc_max_90 is statistically significant at the 95% confidence level over most of California except for northern and central coast and part of SAC. Most of the conclusions drawn in this section hold true when the temperature threshold raises from the 90th to 95th percentile (Tc_max_95; Figs. S8 and S9). Four out of nine regions, including MD, GBV, SC and MC, are projected to experience a twofold increase in number of days surpassing Tc_max_95 in the future decade (Fig. S9). Except for BA, which exhibits a 20% increase, all other regions are predicted to have a 65% or greater increase in number of days exceeding Tc_max_95.

Fig. 9
figure 9

Spatial distribution of a 90 percentile of daily maximum T2 (in K) for summer of 2003–2012; and b future changes in number of days exceeding 90 percentile of daily maximum T2 shown in a. The areas outside the dashed contours in b represent regions with a 95% confidence level (p < 0.05) of the changes in number of days

A heat wave event is typically defined as a period of consecutive days with temperature exceeding a certain threshold (Meehl and Tebaldi 2004; Perkins and Alexander 2013). In this study, heat wave events are defined as a period of at least three consecutive days in which the regional-mean Tmax exceeds the 90th percentile threshold of summertime regional-mean Tmax for the present decade (Tc_max_90_reg) calculated based on WRF downscaled results. The Tc_max_90_reg are 39.3 °C, 35.8 °C, 34 °C, 33.6 °C and 32.6 °C for MD, SAC, SC, SJV, and BA, respectively for the present decade. Our analyses are mainly focused on the regions with a higher population density (e.g. BA, SC, SAC and SJV) as more people will be impacted by heat wave events over these regions. MD is another region of interest in our heat wave analysis because previous studies demonstrated that the desert over the Southwestern US is projected to experience a larger increase in heat extremes under the changing climate than the rest of the US (Jones et al. 2015). More heat wave events are projected to occur in the future decade for all five sub-regions (Fig. 10). As indicated in Fig. 10, an over twofold increase in the number of heat wave events over MD and SC is projected for the future, and the numbers of heat wave events will be increased by 56% and 39% for SJV and SAC, respectively. In contrast, the number of heat wave events is predicted to remain almost unchanged in the future decade for BA. Also shown in Fig. 10 are future changes in the average number of days surpassing summertime Tc_max_90 in each region. The average and maximum durations of heat wave events for the present and future decades are presented in Table 2. The average durations of heat wave events are expected to increase by 1 day and 0.9 days in the future decade for MD and SJV, respectively, while a twofold increase is projected for the maximum duration of heat wave events for both regions. SAC and SC are also projected to experience longer heat wave events. Over BA, although the average duration of heat wave events is projected to increase by 0.3 days, the maximum duration of the events is expected to be reduced during the decadal periods modeled. Overall, both the frequency and duration of heat wave events are expected to increase in the future decade for almost all sub-regions analyzed in this section, within which, MD, SJV and SC are projected to experience greater impacts than other sub-regions. Among these three regions, SC, which includes Los Angeles County, is considered to be one of the most populated areas in the US with a total population of over 10 million; while SJV, which is regarded as the most agriculturally productive region in the US (Schoups et al. 2005), is home to over 4 million people. Hence, the remarkable future changes in hot extremes and heat wave events discussed in this section could result in substantial social and economic impact. Compared to all other regions, a much smaller impact on heat wave events from climate change is expected in BA, which may be associated with the predicted changes in MAP events.

Fig. 10
figure 10

Numbers of heat wave events for the present and future decades for SJV, SC, SAC, BA and MD. The numbers shown on the bar charts for each sub-region are future changes in number of days exceeding Tc_max_90 for the present decade shown in Fig. 9a averaged over each region

Table 2 Average and maximum duration of heat wave events for the present (2003–2012) and future (2046–2055) decade (in days)

4.3 Future changes in marine air penetration events

MAP events are a coastal phenomenon during which cool and moist marine air penetrates inland through one or more passes or corridors in the regional terrain (Fosberg and Schroeder 1966; Olsson et al. 1973). For California, the MAP events refer to days when the daily sea breeze circulation is enhanced by the Pacific coast monsoon (Read 1971) and Delta Breeze (Fosberg and Schroeder 1966) in the northern and central California coast allowing the marine air to penetrate further inland than regular days with sea breeze only. MAP events in California occur from late spring to early fall, with peaks in both frequency and strength during the summer months. During MAP events, marine air penetrates inland through the Carquinez Strait, and then the marine airflow splits into northward flows up the SAC valley and southward flows down the SJV due to the blocking effect of the Sierra Nevada mountain range (Fig. S10a). It was demonstrated in the previous section that California will likely experience more extreme hot days and heat wave events in future decades on top of the overall warming trend during summer months. Under such a circumstance, MAP events are crucial for California because during these events cool and humid marine air intrudes to San Francisco Bay Area region, SJV and SAC, which brings cooling relief to these densely populated regions on hot summer days.

Wang and Ullrich (2018) developed a criterion that is a combination of temperature difference along the CV and 900 hPa on-shore wind speed to identify MAP events occurring in BA and the CV. Using a similar approach, we define a day to be influenced by MAP when (1) the difference in Tmax between the WRF grids encompassing the cities of Lodi (38.1° N, 121.4° W) and Fresno (36.8° N, 119.7° W) exceeds 5 K, and (2) 900 hPa onshore wind speed at the WRF grid encompassing the city of Oakland (37.8° N, 122.27° W) is greater than 4 m/s. The onshore winds around BA are winds with wind directions ranging from 150° to 330°. The locations of these three cities along with the signature wind pattern during a typical MAP day are shown in Fig. S10a, while Fig. S10b illustrates stronger daytime marine air intruding into SJV and SAC through BA in the future. A positive future trend is observed for the occurrence of MAP days using the criteria defined above. Out of 920 summer days for the 10-year simulation period, 505 days and 527 days are identified as MAP days for the present and future decades, respectively. Figure 11 shows daytime T2 and 900 hPa wind averaged over MAP days and non-MAP days for the present decade. It is obvious that the wind patterns are substantially different between MAP and non-MAP days. Figure 11a, b, as well as Fig. S11, illustrate that the winds are enhanced passing through the narrow Carquinez Strait on MAP days, resulting in stronger westerly winds over BA and southerly wind over SAC comparing to those for non-MAP days. Instead of strong southerly wind over SAC for MAP days (Fig. 11a), weak northerly winds are observed for non-MAP days over this region (Fig. 11b). The daytime mean T2 are 3 K and 1.5 K lower averaged over MAP days than over non-MAP days for BA and SAC, respectively (Table S3). The difference (0.4 K) in daytime mean T2 between MAP and non-MAP days over SJV is much less appreciable than that over BA and SAC (Table S2 and Fig. S11), which can be attributed to the fact that marine air often only penetrates to the northern part of SJV during MAP days as the length in the south-north direction of the SJV is substantially longer than that of the SAC.

Fig. 11
figure 11

Daytime (10 am to 6 pm PST) 2-m air temperature (in C) overlaid with 900 hPa wind vector (in m/s) averaged for a MAP days; b non-MAP days for the summer months of the present 10-year simulation period; c the same as a but for the future 10-year simulation period; and d Difference in 900 hPa wind vector (in m/s) averaged for MAP days identified for the future and current 10-year simulation period overlaid with terrain height (in m)

The difference of the mean 900 hPa wind field on MAP days between future and present simulation periods (Fig. 11d) demonstrates that all the three signature flow patterns (onshore wind in BA, northward flows in SAC, and southward flows in SJV) during MAP events are slightly intensified in the future decade, indicating overall stronger MAP events in the 2050 s than the present decade. Wang and Ullrich (2018) pointed out that the presence of the coastal trough through the Pacific Northwest is one main driver of MAP events, because this trough can effectively modify the contours of the 700 hPa geopotential height field to be approximately perpendicular to the west coast, leading to geostrophic wind directed into the BA. The presence of the 700 hPa trough is evident for both the current (Fig. 12a) and future decades (Fig. 12b). Figure 12c suggests a stronger trough in the future decade compared to the present decade, leading to stronger gradients of geopotential heights around BA in the future decade and subsequently stronger geostrophic wind penetrating into the CV through BA. The greater temperature increase over the warm area centered in Colorado and New Mexico compared to the increase over the ocean shown in Fig. 12 intensifies the synoptic-scale temperature contrast between land and ocean, resulting in a stronger 700 hPa trough over the Pacific Northwest (Fig. 12c). Figure 12d demonstrates that the future changes in 700 hPa geopotential heights are statistically significant at the 95% confidence level along and off the coast of California. The intensified 700 hPa trough in the future decade can at least partially explain the more frequent and stronger MAP events in the future. Note that the plots in Fig. 12 are based on the results from the outermost WRF domain in order to show the feature of the entire trough. Overall, our results predict future increases in both frequency and intensity of MAP events, which may help explain the smaller future change in the hot extremes and heat wave events over BA and SAC compared to other regions shown in Sect. 4.2. As a result of the future changes in MAP events, the ventilation is also expected to increase in SAC and SJV in the 2050s, which may dilute local emissions, while at the same time enhancing transport of pollutants from BA to SJV and SAC.

Fig. 12
figure 12

Summer mean daytime (10am to 6 pm PST) 700 hPa temperature (filled; in C) and geopotential height (contours; in m) for a current decade, and b future decade; c differences (future decade-present decade) in summer mean daytime 700 hPa temperature (filled color; in K) and geopotential height (contours; in m); and d differences in summer mean daytime 700 hPa geopotential height (filled color; in m). The areas outside the dashed contours in d represent regions with a 95% confidence level (p < 0.05) of the future changes in summer mean daytime 700 hPa geopotential height. Note that areas with missing data are masked with white in d

5 Concluding remarks

In this study, we have investigated the climate change impact on meteorological variables and extreme events for different geographical sub-regions in California by dynamically downscaling CESM data from CMIP5 simulations with the RCP6.0 emission scenario using the WRF model. The biases of CESM data are corrected based on NARR reanalysis data prior to the downscaling process. Both the climatological mean and inter-annual variance biases of CESM variables are corrected. Our results have shown that correcting both is better to resemble the extremes (e.g., heat wave, drought) inherited in NARR reanalysis than correcting the climatological mean only. In this study, the periods of 2003–2012 and 2046–2055 were selected to represent a present and future decadal period. Relationships for the climatological mean and standard deviation between CESM and NARR variables for 2003–2012 were developed and validated for the corresponding CESM and NARR variables for two validation periods. The validation results show that the relationships remained relatively constant with time, and these statistical relationships were then applied to both the present and future periods to correct the bias of the CESM data.

This paper focuses on the evaluation of the WRF-based dynamically downscaled results and the investigation of future projections of meteorological variables and phenomena in California for summertime. The comparisons among the three sets of WRF results driven by bias corrected CESM data (D_BCCESM), the original CESM (D_OCESM), and NARR reanalysis data (D_NARR) show that for summertime, the positive difference in surface temperature and relative humidity over California from D_OCESM compared to D_NARR are largely reduced (over 50% over several sub-regions) after the use of the bias corrected CESM data. Compared to the original CESM data, the WRF downscaling results driven by the bias corrected CESM data agree better with those driven by NARR reanalysis, including the vertical profiles of summer-mean temperature, water vapor mixing ratio, geopotential height, and wind speed averaged over the analysis domain. Comparisons between WRF downscaled results and surface observations also confirm that D_BCCESM has an overall better model performance than D_OCESM.

Regarding the 2046–2055 projection, our results show an increase of summer-mean surface temperature ranging from 0.5 to 1.5 K for the WRF analysis domain. The regional mean summertime surface temperature is projected to increase over 0.9 K for 7 out of 9 geographical sub-regions over California in the future decade, among which the increase over GBV and MD is over 1.1 K. The projected temperature increase is statistically significant at the 95% confidence level over almost the entire analysis domain. Our results also show greater nighttime increase in surface temperature than daytime increase inland, and subsequent daily temperature range (DTR) reduction. The probability distributions for statewide daily mean, maximum and minimum T2 show shift of the mean and skewness of each toward the hotter part of the distribution. A future increase in Qv2 is also observed over California and Pacific Ocean. Sub-regions with statistically significant Qv2 increases over California are coastal regions (BA and NC) and SAC. PBLH is projected to increase in the future decade for most regions over California. However, future reduction in PBLH is observed over mountain ranges in California, which could be attributed to the projected future increases in summertime surface pressure and lower tropospheric stability (LTS) over the mountain ranges.

Climate change impacts on hot extremes and heat wave events were also assessed in this study. Except for BA, all of the other eight sub-regions are projected to experience an over 50% increase in the number of days surpassing the 90th percentile of the regional Tmax in the present decade by the 2050s, among which the number of days is expected to double over MD and GBV. Analysis of heat wave events was focused on sub-regions with high population density (SAC, SC, SJV, BA) and the MD arid region. Both the frequency and duration of the heat wave events are projected to increase in the future over these five sub-regions. Our WRF downscaling results project an over twofold increase in the number of heat wave events over MD and SC for the future simulation period. Future increases of 56% and 39% in the number of heat wave events are also predicted for SJV and SAC, respectively. Furthermore, the average duration of the heat wave events is projected to increase by ~ 1 day over MD, SJV and SAC, which accounts for an over 20% increase based on the average duration of the heat wave events (about 4 days) in the present decade. The projected future increase in the number and duration of heat wave events over these high-populated regions will be a major threat to public health. In addition, this will likely exacerbate wildfires and drought conditions over these regions. Our results also show a positive future trend in both the number of MAP days and the intensity of the events over northern California, implying that more cool and moist marine air will penetrate to SAC and SJV through San Francisco Bay Area in the future. These future changes in MAP events can be induced by the projected changes in the coastal trough throughout the Pacific Northwest and its strength is predicted to increase around BA, resulting in stronger geostrophic winds penetrating into the valley through BA.

To the best of our knowledge, this is the first comprehensive climate change study using dynamically downscaled bias-corrected CESM results from CMIP5 on to a 4 × 4 km2 grid over California via WRF model, followed by detailed analysis on potential climate change impacts on meteorological variables and phenomena over different geographical sub-regions in California. The results from this study can help the public and policy makers better understand how meteorology relevant to air quality may be altered under a changing climate, and what effect those changes may be expected to have on air quality, as well as demonstrating the importance of greenhouse gas mitigation in preserving a viable future for California. The regional scale meteorological data with a fine resolution from this study coupled with subsequent photochemical modeling will provide air quality predictions for the future, which would help policy makers in developing strategies for mitigating the effect of climate change on future air quality (e.g., additional strategies to achieve even greater emission reductions). It should be noted that this work represents only a single future climate realization, and different assumptions about future trends in greenhouse gas emissions scenarios will result in different future climate realizations, which may produce differing effects on future regional meteorology and air quality.