1 Introduction

China lies in the eastern part of the Asian continent, on the west coast of the Pacific. Moisture in the atmosphere is mainly supplied by the warm, moist southwesterly monsoon flow. The west side of the country is the highest, with the Tibetan Plateau (TP), and the topography descends in steps towards the east. This geographical position and topographical distribution determine the general precipitation climatology in China, while the detailed features of precipitation result from complex interactions among multiple processes. The most important influence comes from the monsoons. During the warm season (May–September), the East Asian summer monsoon dictates the sequential formation of three major rainy seasons in China: the early summer rainy season in southern China (May to mid-June), the Mei-yu season over the Yangtze and Huai River basins (mid-June to mid-July), and the late summer rainy season in northern China (late July and August) (Ding 2013).

There are other important influences. Ocean temperature anomalies associated with El Niño and La Niña events, for example, affect the temperature and precipitation distribution in the following few seasons. The warm pool in the western Pacific and the Indian Ocean also play a role in modulating regional circulation and precipitation (Gu et al. 2018; Yuan et al. 2019). The Tibetan Plateau exerts significant dynamic and thermodynamic forcing on the large-scale circulation and affects the monsoon rain bands and moisture transport (Li et al. 2014; Wan et al. 2017; Wu and Chen 1985; Xu et al. 2015). The snow pack over the Plateau has been shown to modulate precipitation over the upper and middle reaches of the Yangtze River basin (Zhang et al. 2004). As with systems originating over the Rocky Mountains in North America, the low-level vortices steered out from the Plateau can act as important rain-bearing synoptic systems over eastern China and cause excessively heavy rainfall (Tao and Ding 1981; Wang et al. 2012). Anthropogenic influences, such as greenhouse-gas-induced global warming, aerosol emissions, and land use changes due to deforestation and urbanization, have also gained broad recognition (Wang and Li 2007; Wu et al. 2019). In addition, China is one of the countries that suffer the most from tropical cyclones (TCs). TC landfalls are most likely to happen from July to September on the southeast coast of China (Li et al. 2004), causing widespread heavy precipitation and flooding along the coastal areas and even far inland.

The aforementioned processes collectively produce an overall pattern of seasonal precipitation that is high in the southeast and low in the northwest. For most regions, precipitation is concentrated in summer, when the monsoon prevails. Extreme precipitation is most intense over the south coast and the North China Plain (NCP), in association with TCs, surface fronts, low-level vortices/shear lines, and weak synoptic forcing (Luo et al. 2016). The diurnal cycle of precipitation has significant regional variations. For instance, the time of peak precipitation varies from midnight to late afternoon between the edge of the TP and the lower Yangtze River valley (Yu and Li 2016; Yu et al. 2007).

The simulation of precipitation in China with general circulation models (GCMs) has been problematic. It is well known that GCMs tend to predict precipitation that is “too-frequent-too-light” (Demott et al. 2007; Stephens et al. 2010), and this is true over East Asia as well (Zhang and Chen 2016). Also, global models overestimate precipitation over the TP and its southern and eastern peripheries, while underestimating precipitation in southeast China (Zhang and Chen 2016). Moreover, GCMs tend to phase-lock convective precipitation to local noon (Dai 2006) because of deficient convective parameterization and an inability to simulate organized propagating storms, or mesoscale convective systems (MCSs) (Kooperman et al. 2013; Moncrieff 1992). For example, over the eastern China plain, the CAM5 model simulates a noontime maximum of convective rainfall, which is earlier than observed. Over the TP, the CAM5 model simulates a late-afternoon peak of rainfall, whereas the observed rainfall peaks at night. The amplitude of the diurnal variation is also under-represented over the TP and the western China plain (Yuan et al. 2013). Simulating the northward migration of the rain band in the East Asian summer monsoon region is another challenge for GCMs (Chen et al. 2010).

Regional climate models (RCMs) have long been used to dynamically downscale GCM output to improve the prediction of precipitation (Dickinson et al. 1989; Giorgi and Bates 1989). Many studies have demonstrated that RCMs are able to better capture the spatio-temporal distribution of precipitation in some regions (Boberg et al. 2010; Mass et al. 2002; Rauscher et al. 2010; Salathé et al. 2008). However, as in global models, large errors in the phase and amplitude of the diurnal cycle still exist (Dirmeyer et al. 2011; Yuan et al. 2013). Most existing RCMs use grid spacings of 10 km or larger and thus require cumulus parameterization, which is regarded as one of the largest sources of uncertainty in precipitation simulation (Brockhaus et al. 2008; Dai et al. 1999; Liang et al. 2004; Molinari and Dudek 1992; Weisman et al. 1997). Conventional cumulus parameterization is also problematic when the resolution reaches the so-called “grey zone” (Arakawa and Jung 2011; Gao et al. 2017; Grell and Freitas 2013; Yun et al. 2017).

Advances in computing resources have made it possible to run regional climate simulations with “explicit” convection at grid spacings of 4 km or less (Prein et al. 2015; Weisman et al. 1997). Such so-called “convection-permitting” models (CPMs) avoid the error-prone cumulus parameterization and better represent small-scale physical processes, complex terrain, and land–atmosphere coupling. Studies have shown that CPMs improve the simulation of orographic precipitation (Ikeda et al. 2010; Liu et al. 2011; Rasmussen et al. 2011, 2014), the precipitation diurnal cycle (Ban et al. 2014; Guichard et al. 2004; Pearson et al. 2014; Sato et al. 2009), the intensity and frequency of precipitation (Berthou et al. 2018; Demott et al. 2007; Kendon et al. 2012, 2017; Stephens et al. 2010; Weusthoff et al. 2010), the magnitude of extreme precipitation (Ban et al. 2014; Chan et al. 2014; Kendon et al. 2014; Li et al. 2012; Sun et al. 2016), organized propagating storms, or MCSs (Kooperman et al. 2013; Liu et al. 2006; Luo and Chen 2015; Prein et al. 2017; Pritchard et al. 2011), and the Madden–Julian oscillation (Benedict and Randall 2009; Miura et al. 2007). Convection-permitting simulations have been applied over several regions of the world to improve regional climate modeling and projection, such as North America (Liu et al. 2017; Wang et al. 2018), Europe (Ban et al. 2014; Berthou et al. 2018; Fosser et al. 2015; Kendon et al. 2014), Africa (Marsham et al. 2013; Pearson et al. 2014; Stein et al. 2015; Stratton et al. 2018), Australia (Evans et al. 2014), and the Maritime Continent (Birch et al. 2016).

As in other parts of the world, previous short-term CPM simulations over China have demonstrated the potential of high-resolution simulation in this region (Li et al. 2018; Zhu et al. 2018). However, CPMs have not been robustly tested over East Asia via multi-year simulations, which are necessary to evaluate the model across inter-annual variations. This has motivated us to perform convection-permitting regional climate simulations over eastern China for 10 warm seasons (2008–2017) to (1) evaluate the CPM skill in simulating warm-season precipitation climatology in the East Asian monsoon region, (2) identify the promises and problems of the convection-permitting simulation, and (3) investigate solutions for the model deficiencies. The paper is organized as follows: the model setup and data are described in Sect. 2; the evaluation is presented in Sect. 3, followed by an analysis of the modeled precipitation bias and its mitigation in Sect. 4; finally, a summary is given in Sect. 5.

2 Numerical model and data

2.1 Model

We use the Weather Research and Forecasting (WRF) model version 3.9.1.1 (Skamarock et al. 2008) with a single domain at 3-km grid spacing and 51 vertical levels. Key physics options include the Morrison 2-moment scheme (Morrison et al. 2005) for microphysics, the Mellor–Yamada–Janjic (MYJ) scheme (Janjic 1994) for the planetary boundary layer (PBL), the Rapid Radiative Transfer Model for GCMs (RRTMG) scheme (Iacono et al. 2008) for shortwave and longwave radiation, and the community Noah land surface model with multiparameterization options (Noah-MP; Niu et al. 2011) for the land surface. Both the cumulus parameterization and the shallow convection scheme are turned off. This physics configuration was selected based on one-season tests with different microphysics and PBL scheme options. As shown in Fig. 1, the domain covers an area of 2622 km × 1980 km with 874 × 660 grid points. It covers the major topography and precipitation regions in eastern China, such as the eastern part of the TP, the Yun-Gui Plateau, the Sichuan Basin, the Mei-yu region along the Yangtze River basin, and southern China. We conduct ten 5-month simulations, each running from April through August, for the years 2008–2017. The first month of each simulation is treated as spin-up, and our analysis focuses on the May–August period. The initial and boundary forcing data are from the ERA5 reanalysis (ECMWF 2017), which has a grid spacing of 31 km and an hourly time interval.
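As a quick consistency check on the numbers above, the domain extent and the length of the analysis window can be reproduced with a few lines of Python. This is a minimal sketch: the exact start and end dates (1 April and 31 August) are assumptions, since the text only states that each run spans April to August, and the helper name `season_windows` is hypothetical.

```python
from datetime import date

DX_KM = 3            # grid spacing stated in the text
NX, NY = 874, 660    # number of grid points stated in the text

# Domain extent implied by grid spacing times number of points
extent_km = (NX * DX_KM, NY * DX_KM)

def season_windows(years=range(2008, 2018)):
    """Yield (year, run_start, analysis_start, run_end) for each warm season.

    Assumes runs start on 1 April and end on 31 August; April is spin-up.
    """
    for y in years:
        yield y, date(y, 4, 1), date(y, 5, 1), date(y, 8, 31)

windows = list(season_windows())
# Number of analysed days per season (1 May to 31 August, inclusive)
analysis_days = (windows[0][3] - windows[0][2]).days + 1
print(extent_km, len(windows), analysis_days)
```

Under these assumptions each season contributes 123 analysed days, i.e. 2952 hourly samples per grid point.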

Fig. 1

Simulation domain. Surface elevation is shown as shaded contours (unit: m). Station locations are marked as black dots. The four boxes are the regions of interest for diurnal cycle analysis. Box A: Sichuan Basin; Box B: Mei-yu region; Box C: Yun-Gui plateau; Box D: southern China. Also marked are the key geographical features that will be mentioned in subsequent sections

2.2 Validation data

We use three sets of data to evaluate the simulation results. The first is station data obtained from the National Meteorological Information Center (NMIC) of the China Meteorological Administration (CMA; Zhang et al. 2016). The data contain hourly precipitation at 2400+ national weather stations covering Mainland China and Hainan Island. They are collected from automated weather stations (AWS), which are only available after 2000. The distribution of the stations is shown in Fig. 1. On the whole, the coverage is dense over eastern China and relatively sparse over the Plateaus and mountains. The quality control procedure includes a climatological limit value test, an internal consistency test, and a space/time continuity test. The annual mean correct rate of the data is above 99%. The difference between automated and manually observed daily accumulated precipitation is less than 1 mm for 92.1% of the stations. Only 0.9% of the stations have differences greater than 5 mm. Most of the suspicious data lie on the TP and in the mountainous regions.

The second is a gridded dataset (CN05.1; Wu and Gao 2013), which provides daily and monthly mean precipitation and temperature at 0.25° × 0.25°. It is interpolated from the 2400+ national weather station data. The CN05.1 dataset is constructed using the “anomaly approach” as detailed in Xu et al. (2009), but with more station observations. In the “anomaly approach”, a gridded climatology is calculated first, and then a gridded daily anomaly is added to the climatology to obtain the final dataset. The accuracy of the dataset is limited by the density of the underlying station data. Comparison with other gridded datasets confirms that the uncertainty is smaller in eastern China and the plain regions, and larger in the western mountainous regions (Wu and Gao 2013).

The third is a gauge-satellite merged dataset (AWS-CMORPH; Shen et al. 2014), merged from 30,000+ automated national and regional weather station observations and the NOAA Climate Prediction Center Morphing (CMORPH) precipitation product (Joyce et al. 2004). The data have an hourly frequency and a spatial resolution of 0.1° × 0.1°. Since CMORPH data underestimate precipitation in eastern China (Shen et al. 2010), the merged AWS-CMORPH data also have a low bias, albeit a much reduced one. The mean bias of the merged AWS-CMORPH data is −0.012 mm/h from May to September 2008–2010. However, the bias increases with precipitation intensity. For precipitation intensities of 8.0–16.0 mm/h and > 16.0 mm/h, the bias is −1.15 mm/h and −2.72 mm/h, respectively (Shen et al. 2014). In terms of spatial distribution, the errors are larger in the west, where the station distribution is comparatively sparse. This dataset is only available for the warm season (May–September), due to the lack of AWS observations in winter in northern and western China.

When comparing with observed precipitation, only model rain rates larger than 0.1 mm/h are considered, owing to the detection limit of the observations. For comparison with the station data, model results are interpolated to the station locations using the nearest neighbor method. However, differences may still arise because model results represent the average over a grid box, while station data are point observations. If more than 10% of the data are missing for a station, the station is discarded. To compare with the gridded datasets, model-simulated precipitation is interpolated to the grid of the observational data while conserving total precipitation.
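The station-matching steps described above can be sketched in NumPy as follows. This is a sketch under stated assumptions, not the processing code used for the paper: the function names are hypothetical, sub-threshold model rain rates are set to zero (the text only says they are not considered), and distances are approximated on a local plane with longitude scaled by the cosine of latitude.

```python
import numpy as np

def nearest_grid_index(grid_lat, grid_lon, st_lat, st_lon):
    """Flat index of the nearest model grid cell for each station.

    grid_lat, grid_lon: 2-D arrays (ny, nx); st_lat, st_lon: 1-D arrays.
    Uses a planar approximation, which is adequate at 3-km spacing.
    """
    dlat = grid_lat.ravel()[None, :] - st_lat[:, None]
    dlon = grid_lon.ravel()[None, :] - st_lon[:, None]
    dlon = dlon * np.cos(np.deg2rad(st_lat))[:, None]
    return np.argmin(dlat**2 + dlon**2, axis=1)

def sample_model_at_stations(model_rain, grid_lat, grid_lon,
                             st_lat, st_lon, threshold=0.1):
    """Nearest-neighbor sampling of hourly model rain (time, ny, nx) in mm/h.

    Rates at or below `threshold` are set to zero (an assumption).
    """
    idx = nearest_grid_index(grid_lat, grid_lon, st_lat, st_lon)
    sampled = model_rain.reshape(model_rain.shape[0], -1)[:, idx]
    return np.where(sampled > threshold, sampled, 0.0)

def keep_stations(obs, max_missing=0.10):
    """Boolean mask of stations whose missing fraction is at most 10%.

    obs: (time, nstation) array with NaN marking missing values.
    """
    return np.isnan(obs).mean(axis=0) <= max_missing
```

The brute-force distance matrix is only practical for small test cases; for the full 874 × 660 grid and 2400+ stations, processing stations in chunks or using a spatial index would be preferable.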

3 Results

3.1 Seasonal and sub-seasonal rainfall

We first examine the spatial distribution of the simulated monthly precipitation averaged over 10 warm seasons of 2008–2017 (Fig. 2). For each month, the model reproduces the spatial distribution shown in observations. For example, the May–June rainy region in southern China and the middle and lower reaches of the Yangtze River, and the July–August rain band along the foothills of the Plateau and along the southeast coast are all well captured by the model. The major model deficiency is the high bias in simulated precipitation in most regions, similar to what Li et al. (2018) and Zhu et al. (2018) showed in their high resolution simulations. The domain-averaged monthly overestimation ranges from 16.5 to 21.3%, with the largest bias in June. The wet bias is more pronounced over the Tibetan and Yun-Gui Plateaus on the west side of the domain and over the southern mountains. This spatial feature could be in part due to the sparsity and relatively lower quality of observations at these mountainous sites. In June, July and August, precipitation is also overestimated over the North China Plain.

Fig. 2

Average precipitation (colored shading) for each month (mm/day) from model (left), CN05.1 observation (middle), and their percentage differences (right). The numbers inside each plot are the domain-averaged values. The black–white contour lines show the terrain height in meters. From white to black the terrain heights are 100, 500, 1000, and 3000 m

The seasonal average precipitation distribution (Fig. 3) shows similar features in model performance. The model captures the spatial pattern well, but overestimates the amount. The overestimation of the 10-year warm-season mean precipitation is 18.8% compared to CN05.1 data, with the largest bias mainly located over the Tibetan and Yun-Gui Plateaus, the North China Plain, and the southern mountains. The skill of the convection-permitting WRF model in simulating the seasonal precipitation pattern is further confirmed by the validation against station observations, which shows a high pattern correlation of 0.83 and a low root mean square error (RMSE) of 0.068 mm/day. Nevertheless, an overestimation is present at the majority of the station sites, with an average bias of 20.5%.
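For reference, the three summary statistics quoted above can be computed from paired model and observed seasonal means as in the sketch below. The conventions are assumptions, since the text does not specify them: the pattern correlation is taken as the unweighted, centered Pearson correlation across stations, and the percentage bias is computed from the station-mean values rather than as a mean of per-station percentages. The function name `pattern_stats` is hypothetical.

```python
import numpy as np

def pattern_stats(model, obs):
    """Return (pattern correlation, RMSE, percentage bias) for paired fields.

    model, obs: arrays of seasonal-mean precipitation (mm/day) at the same
    locations; pairs with a non-finite value in either field are skipped.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ok = np.isfinite(model) & np.isfinite(obs)
    m, o = model[ok], obs[ok]
    corr = np.corrcoef(m, o)[0, 1]                   # centered Pearson r
    rmse = np.sqrt(np.mean((m - o) ** 2))            # same units as input
    bias_pct = (m.mean() - o.mean()) / o.mean() * 100.0
    return corr, rmse, bias_pct
```

An area-weighted variant would be more appropriate for gridded comparisons such as the one against CN05.1.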

Fig. 3

2008–2017 Seasonal (May–August) average precipitation (mm/day) from a model, b CN05.1 observation, and c their difference; seasonal average precipitation at station locations from d model, e station observation, and f their scatter plot comparison. The numbers inside each plot are the domain-averaged values. The black–white contour lines show the terrain height in meters. From white to black the terrain heights are 100, 500, 1000, and 3000 m

The seasonal mean precipitation for each year averaged over the entire domain is shown in Fig. 4. The inter-annual variation of precipitation is generally reproduced by the model. For example, the fluctuation of seasonal mean precipitation during 2008–2011, the gradual increase of precipitation from 2011 to 2015, and the decrease afterwards are all properly captured by the model. The correlation coefficient between the modeled and observed time series is 0.66. A high bias persists throughout the 10 seasons. An interesting feature is the smaller difference between model and observation over the recent 5 seasons (i.e., after 2013) than prior seasons. An examination of the spatial distribution of the model bias shows that the bias reduction mostly occurs over the TP (figure not shown). This suggests that the reduced model bias in recent years may be a consequence of the improvement of the observation quality in the mountainous region.

Fig. 4

Seasonal (May–August) mean precipitation (mm/day) for each year from model (blue) and CN05.1 observation (red)

3.2 Frequency and intensity of rainfall

Figures 5 and 6 compare the intensity and frequency of rainfall from the model and station observations for wet days (daily accumulation > 1 mm at station location) and heavy rain days (daily accumulation > 25 mm at station location), respectively. Frequency is defined as the percentage of the total number of days. The observed wet-day frequency (Fig. 5b) increases markedly from north to south. For most regions, precipitation occurs on 20–40% of the days, while certain parts of the Plateaus and southern China see the highest precipitation occurrence of > 40%. The model (Fig. 5a) generally agrees with the observation in terms of spatial distribution, despite a prevalent over-prediction. The modeled precipitation frequency exceeds 50% of days in some regions, such as the southeast coast and parts of the southern and western mountains. The percentage difference (Fig. 5c) shows that the overestimation of frequency is widespread over the domain, averaging about 16%, with the highest bias located over the western Sichuan Basin and the North China Plain.
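The two metrics defined above can be written compactly as follows. This is an illustrative sketch: the function name is hypothetical, and frequency is computed relative to the number of days with valid data, which coincides with the total number of days only when no values are missing.

```python
import numpy as np

def frequency_intensity(daily, threshold=1.0):
    """Frequency (% of valid days) and mean intensity (mm/day) of rain days.

    daily: (nday, nstation) daily accumulation in mm, NaN = missing.
    threshold: 1.0 for wet days, 25.0 for heavy rain days.
    """
    valid = np.isfinite(daily)
    hit = valid & (np.where(valid, daily, 0.0) > threshold)
    n_valid = valid.sum(axis=0)
    n_hit = hit.sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        freq = 100.0 * n_hit / n_valid
        # Intensity is averaged over qualifying days only; NaN if none occur
        inten = np.where(hit, daily, 0.0).sum(axis=0) / n_hit
    return freq, inten
```

Calling the function with `threshold=25.0` gives the heavy-rain-day counterparts; stations with no qualifying days return a frequency of zero and an undefined (NaN) intensity.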

Fig. 5

Wet day (daily accumulation > 1 mm) frequency (% of total days, top row) and intensity (mm/day, bottom row) from model (left column), station observation (middle column), and their percentage difference (right column). The numbers inside each plot are the domain-averaged values. The black–white contour lines show the terrain height in meters. From white to black the terrain heights are 100, 500, 1000, and 3000 m

Fig. 6

Heavy rain day (daily accumulation > 25 mm) frequency (% of total days, top row) and intensity (mm/day, bottom row) from model (left column), station observation (middle column), and their percentage differences (right column). The numbers inside each plot are the domain-averaged values. The black–white contour lines show the terrain height in meters. From white to black the terrain heights are 100, 500, 1000, and 3000 m

The distribution of precipitation intensity (daily precipitation amount averaged over all wet days; Fig. 5, lower panels) is different from that of precipitation frequency. As such, the regions with frequent precipitation are not necessarily collocated with intense precipitation. The middle and lower reaches of the Yangtze River valley receive the most intense precipitation, with average rain rates > 20 mm/day, consistent with the high seasonal precipitation there (Fig. 3e), although precipitation is not frequent in this region. High-intensity precipitation is also seen along the southeast coast, which, combined with high frequency, accounts for the largest amount of seasonal precipitation in the domain (Fig. 3e). In contrast, the Plateaus experience frequent precipitation, but with weak intensity (< 10 mm/day), implying abundant drizzle or light rain. Rain intensity is better simulated than frequency in terms of the areal mean error. The model overestimates precipitation intensity over regions similar to those where seasonal mean rainfall is overestimated (Fig. 3). There are also numerous regions with a low intensity bias, such as the central and eastern Basin and the southeast coast. The coastal underestimation is probably related to the biased representation of boundary layer southerly jets and/or land-sea circulations due to insufficient resolution, both of which play important roles in shaping convection development at the coast and nearby inland where synoptic forcing is weak (Du and Chen 2018, 2019).

The model and observations agree that heavy rain days (Fig. 6) account for about one-fifth of all wet days, while their mean intensity is more than three times that of wet days overall. For heavy rain days, the intensity distribution is similar to that of all rain days, but the frequency distribution has more regional variability because of their relative rarity. Heavy rain days are most frequently observed along the southeast coast, the west side of the Wu-Yi Mountains, and the lower Yangtze River valley. They are least likely to occur in the northwest region of the domain (Fig. 6b). The regions with too many heavy rain days (Fig. 6c) often coincide with a high bias in total wet days and overall precipitation intensity (Fig. 5c, f). Such regions include the Plateaus, the southern mountains, and the North China Plain. The bias in heavy precipitation intensity is much smaller and is most noticeable over the North China Plain.

In brief, for wet days and heavy rain days, the model reproduces their different distribution patterns, even though both the precipitation intensity and frequency are biased high at the majority of the station sites. The biases for heavy rain days are larger than those for wet days. In addition, overly frequent precipitation, especially too many heavy rain events, seems to be the leading contributor to the model wet bias.

Figure 7 displays the statistics for the occurrence frequency of different daily and hourly rain categories. The good agreement between the modeled and observed probability distributions provides further evidence of the model’s ability to generate daily and sub-daily precipitation. Nevertheless, the model underestimates the occurrence of the lowest category at both timescales, i.e., days with < 1 mm/day (clear or cloudy days) and hours with rain rates < 0.2 mm/h, and overestimates the frequency of all larger precipitation categories. The overestimations of both daily and hourly rain rates feature a U-shaped pattern, with the minimum bias for moderate rain with intensity of 1–5 mm/h or 10–25 mm/day, and the maximum bias for extreme events. These results confirm that the overestimated seasonal precipitation is mainly caused by too many intense rainfall events.
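The category statistics in Fig. 7 amount to binning rain rates and comparing bin frequencies with the percentage-difference formula given in the figure caption, (model − obs)/obs × 100%. A minimal sketch is given below; the function names are hypothetical and the bin edges in the usage line are illustrative only, since the full set of categories is not listed in the text.

```python
import numpy as np

def category_frequency(rain, edges):
    """Percent of valid samples in each bin [edges[i], edges[i+1]).

    rain: daily (mm/day) or hourly (mm/h) values, NaN = missing.
    edges: increasing bin edges; the last edge may be np.inf.
    """
    rain = np.asarray(rain, dtype=float)
    rain = rain[np.isfinite(rain)]
    edges = np.asarray(edges, dtype=float)
    idx = np.searchsorted(edges, rain, side="right") - 1
    inside = (idx >= 0) & (idx < edges.size - 1)
    counts = np.bincount(idx[inside], minlength=edges.size - 1)
    return 100.0 * counts / rain.size

def percent_difference(model_freq, obs_freq):
    """(model - obs) / obs * 100, as defined in the Fig. 7 caption."""
    return (model_freq - obs_freq) / obs_freq * 100.0

# Illustrative daily categories (mm/day); not the paper's exact bins
daily_edges = [0.0, 1.0, 10.0, 25.0, 50.0, np.inf]
```

Because the frequencies are relative to all days or hours, including dry ones, an excess in the wet categories necessarily appears as a deficit in the lowest category.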

Fig. 7

Occurrence frequency (%) for a daily and b hourly precipitation from model (blue) and station observation (red). The numbers are the percentage differences between model and observation [(model − obs)/obs × 100%]. The occurrence frequencies of daily and hourly precipitation are calculated relative to the total number of days or hours, respectively

3.3 Seasonal migration of rain band

The seasonal meridional migration of the rain band is a distinct feature of precipitation in China. During May and early June, with the onset of the East Asian summer monsoon, heavy rain is mostly confined to South China. Thereafter the rain band marches northward following the monsoon advance. Around mid-June, it reaches the Yangtze River basin, and the Mei-yu season starts. In early or mid-July, the rain band jumps to the Yellow River basin. At the same time, the subtropical high takes control of the Yangtze-Huaihe region and leaves the climate there hot and dry. Meanwhile, at the southern border of the subtropical high, southern China is influenced by tropical storms and typhoons and once again becomes a rainy region. In late July and early August, the rain band reaches northeast China, the northernmost point of its seasonal migration path. Then in September, due to the influence of cold air in the north and the retreat of the southwesterly monsoon flow, the rain band quickly retreats to the south.

We evaluate the simulation of the meridional displacement of the rain band by comparing the daily precipitation anomaly (relative to the warm-season mean, zonally averaged between 110° E and 123° E) in eastern China from the model and the AWS-CMORPH observation (Fig. 8). As shown in the time-latitude diagram, during the pre-summer rainy season in May, the rain band lies between 20° N and 28° N, while heavy rainfall mainly resides south of 26° N. From June onward, the simulated rain band moves northward and reaches 30° N in mid-June, indicative of the rainy season along the Yangtze River valley. At the end of July, precipitation is concentrated north of 32° N in North China and south of 24° N in South China. The latter is linked to the effect of tropical weather systems such as tropical cyclones. Compared with the observation, the model successfully reproduces the latitudinal span, the temporal duration, and the migration characteristics of the rain band, although the anomaly magnitude is somewhat larger. However, this could partly be due to the low bias in the AWS-CMORPH data (Shen et al. 2014).
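The time-latitude diagnostic described above can be sketched as follows. It is assumed here that the data are on a regular latitude-longitude grid and that the anomaly is taken relative to the warm-season mean at each latitude; the text does not state whether the mean is removed per latitude or over the whole band, so this choice and the function name are illustrative.

```python
import numpy as np

def time_latitude_anomaly(daily, lon, lon_min=110.0, lon_max=123.0):
    """Zonally averaged daily precipitation anomaly, shape (nday, nlat).

    daily: (nday, nlat, nlon) precipitation in mm/day, NaN = missing.
    lon:   (nlon,) longitudes in degrees east.
    """
    sel = (lon >= lon_min) & (lon <= lon_max)
    zonal = np.nanmean(daily[:, :, sel], axis=2)          # (nday, nlat)
    seasonal_mean = np.nanmean(zonal, axis=0, keepdims=True)
    return zonal - seasonal_mean
```

For a multi-year composite, `daily` would first be averaged over the 10 seasons for each calendar day before the anomaly is taken.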

Fig. 8

Time-latitude distributions of zonally averaged (110° E–123° E) daily precipitation anomaly relative to the seasonal mean from a the model and b AWS-CMORPH observation (mm/day)

3.4 Precipitation diurnal cycle

3.4.1 Spatial patterns

The diurnal cycle of precipitation is caused by solar heating and is modulated by regional and synoptic-scale circulation and thermodynamic conditions. Because of its large amplitude and short timescale, the diurnal cycle provides a good test bed for weather and climate models (Dai and Trenberth 2004). To analyze the simulated diurnal cycle, we first interpolate the model results to the observation grid, and then use the hourly precipitation of the entire simulation period to compute a 10-warm-season composite diurnal cycle for each location. If more than 10% of the observed values are missing for a particular hour, the mean rain rate of that hour is set to missing. If any of the 24 hourly composite rain rates is missing, the times of maximum precipitation amount and frequency are not computed for that location. In this way, we avoid calculating erroneous diurnal cycles over places with frequent missing observations, for example, over the TP. All the following diurnal cycle analysis is based on local solar time (LST).
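The compositing rules above can be sketched as follows. Two details are assumptions not given in the text: the input series is taken to start at 00 UTC, and LST is approximated by shifting UTC by the longitude divided by 15°, rounded to the nearest whole hour. The function names are hypothetical.

```python
import numpy as np

def composite_diurnal_cycle(rain_utc, lon, max_missing=0.10):
    """Composite hourly rain rate indexed by LST hour, shape (24, npoint).

    rain_utc: (nhour, npoint) hourly rain (mm/h) starting at 00 UTC,
              NaN = missing.
    lon:      (npoint,) longitude in degrees east.
    """
    nt, npoint = rain_utc.shape
    utc_hour = np.arange(nt) % 24
    comp = np.full((24, npoint), np.nan)
    for h in range(24):
        block = rain_utc[utc_hour == h]
        # An hour is kept only if at most 10% of its samples are missing
        ok = np.isnan(block).mean(axis=0) <= max_missing
        comp[h, ok] = np.nanmean(block[:, ok], axis=0)
    # Rotate each column from UTC to (whole-hour) local solar time
    shift = np.rint(np.asarray(lon) / 15.0).astype(int)
    out = np.empty_like(comp)
    for j in range(npoint):
        out[:, j] = np.roll(comp[:, j], shift[j])
    return out

def peak_hour(comp):
    """LST hour of the composite maximum; NaN if any of the 24 values is missing."""
    complete = np.isfinite(comp).all(axis=0)
    peak = np.full(comp.shape[1], np.nan)
    peak[complete] = np.argmax(comp[:, complete], axis=0)
    return peak
```

The same two functions applied to a 0/1 rain-occurrence series, rather than rain amount, would give the time of maximum precipitation frequency.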

The diurnal cycle of precipitation in China exhibits distinct regional characteristics. Over the Plateau and Sichuan Basin, precipitation features nocturnal peaks (Fig. 9b). Further east, the peak time is gradually delayed to the morning and then transitions to the afternoon. A late afternoon peak prevails east of 110° E, while scattered night-morning peaks exist in the lower reaches of the Yangtze River valley, northern China, and coastal areas. The widespread late afternoon peak in the domain is a well-known result of accumulated daytime solar heating at the surface, which leads to the highest low-level instability and thus the maximum potential for convection initiation in the afternoon. The leading mechanism for the nocturnal peak of precipitation in the Sichuan Basin is still an open question. Previous studies have attributed the nighttime precipitation preference to (1) the upslope/downslope-wind-induced daily alternation of low-level divergence and convergence in connection with the daytime heating and nighttime cooling of the surrounding elevated terrain (Huang et al. 2007; Jin et al. 2013), (2) the eastward propagating convective systems from the Plateau (Bai et al. 2008; Qian et al. 2015), (3) the cloud top radiative cooling during the night (Yuan et al. 2013), and (4) the diurnal inertial oscillations of the south-southwesterly low-level jet into the basin (Zhang et al. 2016).

Fig. 9

a, b Time of maximum precipitation amount (LST), c, d time of maximum precipitation frequency (LST), and e, f amplitude of diurnal variation (%). The left and right panels are for model and station observation, respectively. The black–white contour lines show the terrain height in meters. From white to black the terrain heights are 100, 500, 1000, and 3000 m

The model captures most of the regional variability, particularly the night-morning transition from the lee side of the Plateaus through the Yangtze River basin, and the broad afternoon peaks (Fig. 9a). However, the model over-predicts the occurrence of the afternoon peak across the domain, and under-predicts the occurrence of the night precipitation peak in the central Yun-Gui Plateau and the east side of the Sichuan Basin. The distribution of maximum frequency time (Fig. 9d) is similar to that of maximum precipitation time, except that an early-morning peak in frequency is more prevalent in North China. The modeled precipitation is too frequent in the afternoon, especially east of 108° E.

Besides the timing of maximum precipitation, amplitude is another often-used parameter for characterizing diurnal variations. The amplitude of diurnal variation is calculated as:

$$ A = \frac{P_{max} - P_{mean}}{P_{mean}} \times 100\% \tag{1} $$

For each station location, Pmax and Pmean are the maximum and average hourly precipitation in the composite diurnal cycle, respectively. In the observations (Fig. 9f), the regions with the largest amplitude (exceeding 100%) are southern China, Hainan Island, and the west side of the Sichuan Basin. These strong diurnal signals may be related to the land-sea breeze over the island and coastal regions, and the mountain-valley breeze over the steep terrain between the Plateau and the Basin (Mao and Wu 2012). The model does an excellent job of reproducing the observed geographical characteristics of diurnal amplitude, notwithstanding a slight over-prediction in the southern coastal area and an under-prediction on the west side of the Basin (Fig. 9e).
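Equation 1 translates directly into code once a 24-hour composite is available. The sketch below assumes a complete composite of shape (24, npoint); the function name is hypothetical.

```python
import numpy as np

def diurnal_amplitude(comp):
    """Amplitude of diurnal variation A (%) following Eq. 1.

    comp: (24, npoint) composite hourly precipitation (mm/h).
    Locations with zero mean precipitation return NaN or inf.
    """
    p_max = comp.max(axis=0)
    p_mean = comp.mean(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        return (p_max - p_mean) / p_mean * 100.0
```

As a worked example, a cycle that spends 12 h at 0.5 mm/h and 12 h at 1.5 mm/h has Pmean = 1.0 mm/h and Pmax = 1.5 mm/h, giving A = 50%. An amplitude above 100% therefore requires the peak hourly rate to be more than twice the daily mean rate.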

3.4.2 Regional features

In view of the complex spatial patterns, we select four regions for more detailed investigations: the Sichuan Basin (28° N–32° N, 103° E–107° E), the Mei-yu region (27° N–32° N, 112° E–120° E), the Yun-Gui Plateau (22° N–26° N, 100° E–106° E), and southern China (22° N–27° N, 110° E–120° E). The locations of the four regions are shown in Fig. 1. In Fig. 10 the diurnal cycle is depicted by the hourly precipitation anomaly from the daily mean value. As the observation shows, the Sichuan Basin features a single nocturnal peak in precipitation; the Mei-yu region has two peaks, one in the early morning, the other in the mid-afternoon; the Yun-Gui Plateau displays a double peak as well, but is dominated by the late night peak; southern China is dominated by a mid-afternoon peak.
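For reproducibility, the four boxes and the regional anomaly plotted in Fig. 10 can be expressed as in the sketch below. The box bounds are those listed above; the function names, the inclusive treatment of box edges, and the unweighted average over points inside each box are assumptions.

```python
import numpy as np

# (lat_min, lat_max, lon_min, lon_max) in degrees, as listed in the text
REGIONS = {
    "Sichuan Basin": (28.0, 32.0, 103.0, 107.0),
    "Mei-yu region": (27.0, 32.0, 112.0, 120.0),
    "Yun-Gui Plateau": (22.0, 26.0, 100.0, 106.0),
    "southern China": (22.0, 27.0, 110.0, 120.0),
}

def region_mask(lat, lon, box):
    """Boolean mask of points falling inside a box (edges inclusive)."""
    lat_min, lat_max, lon_min, lon_max = box
    return (lat >= lat_min) & (lat <= lat_max) & (lon >= lon_min) & (lon <= lon_max)

def regional_diurnal_anomaly(comp, mask):
    """Hourly anomaly from the daily mean (mm/h), shape (24,).

    comp: (24, npoint) composite diurnal cycle; mask: (npoint,) boolean.
    """
    regional = np.nanmean(comp[:, mask], axis=1)
    return regional - regional.mean()
```

By construction the 24 anomaly values sum to zero, so the curves in Fig. 10 show the shape of the cycle rather than the mean bias.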

Fig. 10

Diurnal cycle over four sub-regions: a Sichuan-Basin, b Mei-yu Region, c Yun-Gui Plateau, and d southern China. The lines are precipitation anomaly with respect to daily mean (mm/h) from model (orange) and AWS-CMORPH observation (black)

The model captures the different regional features of the diurnal cycle reasonably well. Nevertheless, there are apparent deviations from observations, especially in the afternoon precipitation. For the Sichuan Basin, the model captures the nocturnal peak, but produces a secondary peak in the afternoon, which may be caused by enhanced mountain solar heating (Jin et al. 2013). For the Mei-yu region, the model reproduces the morning peak successfully, but exaggerates the afternoon precipitation. For the Yun-Gui Plateau, the model under-predicts the late-night-to-morning precipitation but over-predicts the afternoon precipitation. A similar model deficiency is also notable for southern China. In general, the model tends to over-predict the afternoon precipitation, in line with the over-prediction of precipitation amount and frequency discussed earlier.

The diurnal pattern in Fig. 10 helps explain the model bias in the diurnal variation amplitude over southern China and the Sichuan Basin shown in Fig. 9. According to Eq. 1, the diurnal variation amplitude is determined by the maximum precipitation anomaly and the mean precipitation. In the Sichuan Basin, the model predicts a smaller maximum precipitation anomaly during the night (Fig. 10a) and a larger mean precipitation (Fig. 3c), both of which contribute to the underestimation of the diurnal variation amplitude there. For southern China, the diurnal cycle plot (Fig. 10d) shows that the maximum precipitation anomaly in the afternoon is larger in the model, while the bias in the mean precipitation is not consistent (Fig. 3c). The net result is dominated by the over-simulated afternoon precipitation anomaly, which leads to an over-prediction of the diurnal variation amplitude.
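The reasoning above can be checked with a small numerical sketch. It assumes that Eq. 1 has the normalized form (Pmax − Pmean)/Pmean expressed in percent, which is consistent with the description in the text but is not restated in this section; the function name and the two synthetic cycles are hypothetical.

```python
def diurnal_amplitude(cycle):
    """Normalized diurnal amplitude (%) of a composite 24-h cycle in mm/h.

    Assumed form of Eq. 1: 100 * (Pmax - Pmean) / Pmean.
    """
    p_mean = sum(cycle) / len(cycle)
    if p_mean <= 0:
        raise ValueError("mean precipitation must be positive")
    return 100.0 * (max(cycle) - p_mean) / p_mean


# Synthetic "observed" cycle: low background with a sharp nocturnal peak.
obs_cycle = [0.20] * 24
obs_cycle[2] = 0.60

# Synthetic "model" cycle: weaker peak and wetter background, mimicking the
# Sichuan Basin case (smaller maximum anomaly, larger mean precipitation).
model_cycle = [0.25] * 24
model_cycle[2] = 0.55

amp_obs = diurnal_amplitude(obs_cycle)
amp_model = diurnal_amplitude(model_cycle)
print(f"observed amplitude: {amp_obs:.1f}%")
print(f"model amplitude:    {amp_model:.1f}%")
```

Both changes act in the same direction: a smaller numerator and a larger denominator each lower the ratio, so the synthetic model amplitude falls below the synthetic observed one.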

3.4.3 Effect of propagating convection

Previous observational studies have documented the frequent occurrence of organized propagating convection (such as mesoscale convective systems, or MCSs) in the warm season, and its remarkable impact on the diurnal variation of precipitation over some continental regions: the contiguous US (Carbone et al. 2002), East Asia (Wang et al. 2004), Africa (Laing et al. 2008), and South Asia (Liu et al. 2008). Similar to the well-studied summertime convection east of the Rocky Mountains, diurnally repeated daytime convection in response to strong surface heating of the elevated Plateau can organize into various long-lived mesoscale systems under favorable large-scale conditions, such as the presence of adequate vertical wind shear, and then travel off the Plateau from the eastern flank (Bai et al. 2008). These sequences of organized convection cause precipitation episodes with lifetimes far exceeding those of individual convective systems (Wang et al. 2004), significantly affecting the spatio-temporal precipitation patterns and causing coherent diurnal variation patterns from the Plateau to regions far downstream.

The time-longitude plots of diurnal precipitation anomaly (often referred to as Hovmöller diagrams) in Fig. 11 give a closer look at how propagating convection modulates the regional diurnal variation along the Yangtze River valley and how well the convection-permitting simulation replicates it. Herein, composite hourly precipitation is first latitudinally averaged between 27° N and 32° N. Then the anomaly relative to the daily mean is calculated, with the highest positive anomaly corresponding to the daily precipitation peak at each longitude. In both the observation and the model (Fig. 11), the precipitation anomaly shows a coherent signature, commencing at the plateau periphery and traveling all the way to the lower Yangtze River valley. Specifically, the eastward-delayed timing of peak precipitation is characterized by an evening precipitation maximum on the lee side, a nocturnal maximum over the upper Yangtze River valley (between 102° E and 108° E), and a morning maximum farther downstream. This phase shifting is a clear manifestation of the sequential propagating convection reported in Wang et al. (2004), comparable to the travelling diurnal signature in North America (Carbone et al. 2002). Of note is that, in addition to the morning precipitation linked to propagating convection, the lower reaches of the Yangtze River valley (east of 114° E) are dominated by synchronous rainfall peaking in the afternoon. It is also worth pointing out that the coherent phase transition seen in both the observation and the model points to the foremost importance of propagating convection from the Plateau in accounting for the nocturnal precipitation east of the Plateau.
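The two processing steps described above (latitudinal averaging, then removal of the daily mean at each longitude) can be sketched as follows. The array layout `precip[hour][ilat][ilon]`, the function names, and the synthetic field are assumptions for illustration; the latitude mean here is unweighted, which is a simplification and not necessarily what was done in the study.

```python
def hovmoller_anomaly(precip, lats, lat_min=27.0, lat_max=32.0):
    """Time-longitude anomaly of composite hourly precipitation (mm/h).

    precip[hour][ilat][ilon] is a hypothetical nested-list layout.
    Returns anomaly[hour][ilon] relative to each longitude's daily mean.
    """
    rows = [i for i, lat in enumerate(lats) if lat_min <= lat <= lat_max]
    if not rows:
        raise ValueError("no latitudes inside the averaging band")
    n_hours = len(precip)
    n_lon = len(precip[0][0])
    # Step 1: average over the latitude band (unweighted, for simplicity).
    band = [
        [sum(precip[h][i][j] for i in rows) / len(rows) for j in range(n_lon)]
        for h in range(n_hours)
    ]
    # Step 2: subtract the daily mean at each longitude.
    daily = [sum(band[h][j] for h in range(n_hours)) / n_hours for j in range(n_lon)]
    return [[band[h][j] - daily[j] for j in range(n_lon)] for h in range(n_hours)]


def peak_hours(anomaly):
    """Hour index of the largest positive anomaly at each longitude."""
    n_hours = len(anomaly)
    n_lon = len(anomaly[0])
    return [max(range(n_hours), key=lambda h: anomaly[h][j]) for j in range(n_lon)]


# Synthetic field: inside the band the peak is delayed by 3 h per longitude
# step, mimicking eastward propagation; outside the band a strong noon peak
# is planted to confirm that it is excluded by the latitude mask.
lats = [25.0, 28.0, 30.0, 34.0]
lons = [102.0, 106.0, 110.0, 114.0, 118.0]


def synthetic_value(hour, lat, j):
    if 27.0 <= lat <= 32.0:
        return 0.1 + (0.5 if hour == (18 + 3 * j) % 24 else 0.0)
    return 5.0 if hour == 12 else 0.1


precip = [
    [[synthetic_value(h, lat, j) for j in range(len(lons))] for lat in lats]
    for h in range(24)
]
anomaly = hovmoller_anomaly(precip, lats)
peaks = peak_hours(anomaly)
print(dict(zip(lons, peaks)))
```

In this synthetic case the peak hour shifts later toward the east, which is the kind of phase delay that appears as a slanted band in a Hovmöller diagram.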

Fig. 11

Time-longitude distributions of precipitation anomaly (mm/hour) relative to daily means from a model and b AWS-CMORPH observation. The precipitation diurnal cycle is averaged temporally over 10 warm seasons and spatially between 27° N and 32° N

In summary, the convection-permitting model captures the diurnal phase transition along the Yangtze River valley reasonably well, suggesting superior skill at representing propagating convection compared to GCMs relying on parameterized convection (Yuan 2013; Yuan et al. 2013). However, as noted earlier, the model overestimates the afternoon precipitation at almost all longitudes, giving rise to a false precipitation maximum in the afternoon between 105° E and 110° E.

4 Discussion

The motivation for dynamical downscaling is to improve the representation of topography, the precipitation diurnal cycle, the simulation of high-intensity precipitation events, and organized propagating convective systems. The preceding evaluation shows that the convection-permitting model we used capably fulfills these purposes and reasonably reproduces the major features of eastern China precipitation in the warm season. However, there exists an overarching problem of over-predicting precipitation amount and frequency, especially in the afternoon. The simulation by Li et al. (2018) using a convection-permitting version of the Met Office Unified Model showed a similar wet bias over eastern China. Zhu et al. (2018) also found an over-prediction of afternoon precipitation in southeast China in their daily forecasts using a 4-km WRF model. As such, the high precipitation bias seems to be a common problem for convection-permitting simulations, not specific to the WRF model, and exists in both long-term regional climate simulations and weather-forecast simulations. Part of the bias may come from observational uncertainties as stated in Sect. 3.1, while part of it may come from model deficiency. Based on the setup of the simulation and the specific conditions in eastern China, we speculate that there are at least two reasons for this model deficiency. Firstly, the 3-km grid spacing is not sufficient to resolve shallow convective clouds. This could lead to too much solar radiation reaching the surface and consequently too strong near-surface thermal instability in the afternoon. Secondly, China is one of the regions with the highest aerosol concentrations in the world; however, the multiple effects of aerosols are not fully considered in the simulation design. In the following, we present a preliminary analysis and discussion of these issues, but a thorough investigation is beyond the scope of the present study.

4.1 Effect of sub-grid clouds

Studies have found that GCMs tend to under-predict low-altitude cloud coverage (Ma et al. 2014). Diagnostic cloud fraction schemes are therefore used in GCMs to mitigate this problem. For regional models, although the dynamics and physics are more realistic, such under-prediction remains even with single-digit kilometer grid spacing (Eikenberg et al. 2015). The under-prediction of low clouds will lead to high-biased surface insolation and the development of unrealistically strong thermal instability. The cloud fraction scheme in our model setup is the Xu and Randall (1996) scheme, which is commonly used in the WRF community. This scheme is based on grid-scale condensate generated by the microphysics scheme and predicts zero cloud fraction when there is no grid-scale condensate. This will likely lead to an under-prediction of sub-grid clouds because sub-grid clouds can exist when the grid-scale relative humidity (RH) is below saturation (i.e., when there is no grid-scale condensate). Insufficient low clouds associated with the Xu–Randall scheme have been consistently reported in evaluations against satellite data (Cintineo et al. 2014; Thompson et al. 2016).

In an attempt to improve the cloud representation in the WRF model, Thompson et al. (2017) developed a new cloud fraction scheme. The scheme uses an RH-based approach following Sundqvist et al. (1989), with an additional grid spacing dependence for the critical RH threshold (RHcrit). Different RHcrit values are used over ocean versus land, as follows:

$$ RH_{crit} = \begin{cases} 0.65 + \sqrt{1/\left(25 + \Delta_{x}^{3}\right)} & \text{over land} \\ 0.81 + \sqrt{1/\left(50 + \Delta_{x}^{3}\right)} & \text{over ocean} \end{cases} $$
(2)

Here Δx is the grid spacing in km. Besides cloud fraction, the scheme also diagnoses sub-grid cloud condensates (i.e., liquid water content and ice water content) for use in radiation calculations. When the grid is sub-saturated, the RH-based scheme does not require the existence of grid-scale condensate and can predict the existence of sub-grid clouds when RH is above RHcrit. Therefore, the new scheme generally increases low cloud amounts and reduces surface insolation. Comparison with U.S. Climate Reference Network (USCRN) solar radiation measurements confirms this advantage over the Xu–Randall scheme (Thompson et al. 2017).
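To make Eq. 2 concrete, the sketch below evaluates RHcrit for a given grid spacing. The function `rh_crit` follows Eq. 2 directly. The second function is an assumption: it uses the classic Sundqvist-type closure, 1 − sqrt((1 − RH)/(1 − RHcrit)), to turn RH into a cloud fraction, and the actual implementation in WRF may differ in detail. Both function names are hypothetical.

```python
import math


def rh_crit(dx_km, over_land):
    """Critical relative humidity from Eq. 2 (dx_km is grid spacing in km)."""
    base, offset = (0.65, 25.0) if over_land else (0.81, 50.0)
    return base + math.sqrt(1.0 / (offset + dx_km ** 3))


def sundqvist_cloud_fraction(rh, rhc):
    """Sundqvist-type cloud fraction (assumed closure, for illustration).

    Zero at or below the threshold, rising to one at saturation.
    """
    if rh <= rhc:
        return 0.0
    if rh >= 1.0:
        return 1.0
    return 1.0 - math.sqrt((1.0 - rh) / (1.0 - rhc))


# For the 3-km grid spacing used in this study, Eq. 2 gives roughly
# 0.79 over land and 0.92 over ocean.
rhc_land = rh_crit(3.0, over_land=True)
rhc_ocean = rh_crit(3.0, over_land=False)
print(f"RHcrit over land:  {rhc_land:.3f}")
print(f"RHcrit over ocean: {rhc_ocean:.3f}")

# A sub-saturated land grid cell at RH = 0.85 gets a non-zero cloud fraction,
# whereas a condensate-based scheme would give zero without grid-scale condensate.
cf_land = sundqvist_cloud_fraction(0.85, rhc_land)
cf_ocean = sundqvist_cloud_fraction(0.85, rhc_ocean)
print(f"cloud fraction at RH = 0.85: land {cf_land:.3f}, ocean {cf_ocean:.3f}")
```

The grid-spacing term shrinks quickly as Δx grows, so at coarse resolution the thresholds approach 0.65 over land and 0.81 over ocean, while at fine resolution a cell must be closer to saturation before sub-grid cloud is diagnosed.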

To assess the possible benefits of the new cloud fraction scheme, we conduct one warm-season test for the year 2008. The domain-averaged daytime vertical profile of cloud fraction (Fig. 12) shows that the Thompson et al. scheme increases cloud fraction at almost all altitudes from 900 hPa up to about 200 hPa, consistent with the testing over North America (Thompson et al. 2017). The cloud fraction increase occurs over most of the domain, including the TP and the Mei-yu region (figures not shown). In response to the widespread increase in cloud amount, the domain-averaged shortwave radiation reaching the surface decreases from 245.7 to 225.3 W/m² (Fig. 13). The resulting weakening of surface thermodynamic instability leads to a reduction of the seasonal mean precipitation from the original 7.57 mm/day (Fig. 14b) to 6.83 mm/day (Fig. 14c), corresponding to a 9.8% precipitation decrease. The difference from the CN05.1 observation decreases from 23% (Fig. 14e) to 10.9% (Fig. 14f). The result generally supports our hypothesis that the improper sub-grid cloud treatment in our simulations may partially contribute to the high precipitation bias. However, we should point out that, due to limited computing resources and the scope of the study, the internal variability of the model is not assessed. A complete understanding of the significance of the sub-grid cloud effects, taking model internal variability into account, deserves further investigation and would be the focus of a separate study.
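The relative changes quoted above follow from the domain-averaged values by simple arithmetic, as the short check below shows. The helper name `percent_change` is hypothetical; the input numbers are those given in the text.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


# Seasonal mean precipitation (mm/day): Xu-Randall vs. Thompson et al. scheme.
precip_change = percent_change(7.57, 6.83)

# Domain-averaged surface shortwave radiation (W/m^2) for the same two runs.
shortwave_change = percent_change(245.7, 225.3)

print(f"precipitation change: {precip_change:.1f}%")
print(f"shortwave change:     {shortwave_change:.1f}%")
```

The precipitation change rounds to the 9.8% decrease stated in the text, and the 20.4 W/m² drop in surface shortwave radiation corresponds to a decrease of about 8.3%.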

Fig. 12

Vertical profile of daytime cloud fraction for the 2008 warm season from the Xu–Randall and the Thompson et al. scheme. Profile is averaged on model levels over the entire domain. The pressure corresponds to the domain mean at each model level

Fig. 13

2008 warm-season (May–August) mean short-wave radiation flux at surface (W/m²) from a the Xu–Randall and b the Thompson et al. scheme. The numbers inside each plot are the domain-averaged values. The black-white contour lines show the terrain height in meters. From white to black the terrain heights are 100, 500, 1000, and 3000 m

Fig. 14

2008 warm-season mean precipitation (mm/day) from a observation, b simulation with the Xu–Randall cloud fraction scheme, c simulation with the Thompson et al. cloud fraction scheme, and d simulation with the aerosol-aware microphysics scheme, and e–g the percentage differences from observation. The numbers inside each plot are the domain-averaged values. The black–white contour lines show the terrain height in meters. From white to black the terrain heights are 100, 500, 1000, and 3000 m

4.2 Effect of aerosols

A distinct regional feature in East Asia is the high aerosol concentration (de Leeuw et al. 2018). On the one hand, aerosols can affect solar radiation by absorbing or scattering sunlight (aerosol-radiation interactions, or ARI). On the other hand, aerosols acting as cloud condensation nuclei (CCN) and ice nuclei (IN) can affect cloud microphysical processes by altering cloud droplet and ice particle concentrations, and modulate precipitation efficiency (aerosol-cloud interactions, or ACI). According to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC 2013), the global radiative impact of aerosol-radiation interactions is negative (cooling), at −0.45 W/m². The effects of aerosol-cloud interactions are more complex and uncertain, especially when ice nuclei are considered (Altaratz et al. 2014; Fan et al. 2016; Khain 2009; Seinfeld et al. 2016; Yun and Penner 2012, 2013). The net effect of ARI and ACI is assessed to be −0.9 W/m² with medium confidence (IPCC 2013). Observational and modeling studies have found that aerosols can significantly affect long-term precipitation trends in China (Guo et al. 2017, 2018; Huang et al. 2007; Liu et al. 2016, 2019; Qian et al. 2009; Wang et al. 2011; Yang and Li 2014). Multi-year regional chemistry-climate model simulations for East Asia by Giorgi et al. (2003) showed that both the aerosol direct effect through radiation and the indirect effect through cloud interactions result in a cooling of the surface and a decrease in precipitation. With a similar model, Huang et al. (2007) showed that sulfate and carbonaceous aerosols reduce solar radiation and decrease surface temperature, while increasing cloud liquid water through aerosol-cloud interactions. The end result is a 10% decrease of precipitation in fall and winter, and a 5% decrease in spring and summer. Liu et al. (2016) found that anthropogenic aerosols reduce shortwave flux at the surface, increase cloud droplet number concentration, and decrease daily precipitation in most areas of East Asia.

In our model setup, the aerosol-radiation interactions are considered by using a 5° × 4° 1990s aerosol climatology in the RRTMG radiation scheme. The aerosol optical depth (AOD) is very small with an area average of 0.13 (Fig. 15), compared to satellite observation of East China in recent years (typical values > 0.5; de Leeuw et al. 2018). Therefore, the impact of aerosol-radiation interactions is undoubtedly underestimated in current simulations.

Fig. 15

AOD of the climatological aerosol field used in the RRTMG radiation scheme in current model setup. The number inside the plot is the domain-averaged value. The black–white contour lines show the terrain height in meters. From white to black the terrain heights are 100, 500, 1000, and 3000 m

In addition to the improper treatment of aerosol-radiation interactions, the aerosol-cloud interactions are completely missing in the simulations, because the employed microphysics scheme uses a constant number of cloud droplets. To quantify the impact of poorly treated ARI and ACI, we carry out a sensitivity simulation with the Thompson aerosol-aware microphysics scheme (Thompson and Eidhammer 2014). This scheme parameterizes an aerosol emission flux using near-surface aerosol concentration and surface wind. Hygroscopic particles are combined into a water-friendly aerosol species, while dust particles are accumulated into an ice-friendly aerosol species. The effects of aerosols on radiation are added to the radiation scheme. Cloud droplet nucleation from aerosols is included using a lookup table. Ice nucleation on dust is treated following DeMott et al. (2010) and Phillips et al. (2008). Both aerosol-radiation and aerosol-cloud interactions are thus taken into account in this scheme. The scheme has apparent deficiencies, namely its oversimplified aerosol emission treatment and its monthly varying aerosol climatology. Despite these deficiencies, the scheme is advantageous for long-term simulations because of its limited additional computational cost.

A one-season simulation is conducted with the aerosol-aware microphysics scheme for the warm season of 2008. The result shows a reduction of the wet bias from 23.1% (Fig. 14e) to 17.8% (Fig. 14g). However, it should be pointed out that this is only a preliminary test with highly simplified aerosol emissions. Future studies with more realistic aerosol emissions that take internal variability into account are needed to further investigate the effect of aerosols on eastern China precipitation, including the relative importance of aerosol-radiation and aerosol-cloud interactions.

Taken together, the missing clouds, the low AOD, and the absent aerosol-cloud interactions could be collectively responsible for the high bias in model precipitation. Improved representations of these processes will potentially improve the model performance greatly.

5 Summary

The warm-season precipitation in China has many unique features associated with complex interactions among the East Asian summer monsoon, ocean temperature, orographic forcing, and anthropogenic changes. Its simulation has been a tremendous challenge for both global and regional climate models, largely due to deficient cumulus parameterizations. Convection-permitting simulations have been shown to improve regional climate simulation and projection over many regions of the world. However, their performance in simulating East Asia precipitation has not been robustly tested. In this study, 10 warm-season (2008–2017) simulations are performed with the convection-permitting WRF model over eastern China. The model uses 3-km grid spacing and covers major topography and precipitation regions in China. The simulation results are compared to station observations, gridded station data, and gauge-satellite merged observational data. The results are summarized as follows:

  • The model faithfully reproduces the spatial distribution of seasonal and sub-seasonal precipitation, as well as the inter-annual variations of precipitation.

  • The distribution patterns of precipitation intensity and frequency are well simulated, but the values are biased high, particularly for heavy rain days.

  • The meridional migration of the rain band following the seasonal advance and retreat of the East Asian monsoon flow and the western North Pacific subtropical high is successfully reproduced. The migration span, duration, and timing are all comparable to observations.

  • The model shows good skill in simulating the regional patterns of precipitation diurnal cycle and the phase transition from west to east, but tends to over-predict the afternoon precipitation. The modeled diurnal amplitude agrees with observations over most of the domain, with exceptions of an overestimation in southern China and underestimation along the west side of the Basin.

  • Propagating convection east of the Plateau is well simulated, as evinced by the diurnal phase variation from the Plateau through the Yangtze River valley. This confirms the value of convection-permitting models in simulating organized convection and coherent diurnal signal.

  • The model overestimates precipitation amount by up to about 20%, mostly over the Tibetan and Yun-Gui Plateaus, the southern mountains, and the North China Plain. The overestimation of precipitation amount results mainly from the overestimated precipitation frequency, in particular heavy rain occurrence.

This study is the first multi-year convection-permitting simulation of warm-season precipitation in a geographical region under the East Asian monsoon climate. The demonstrated value of convection-permitting simulations suggests their promising application to studying regional climate changes and impacts. The proposed mitigation of the precipitation bias provides a pathway for future model development and useful guidance for simulation design. Several physical processes, which are either poorly represented or completely absent in the model physics configuration, are suggested as explanations for the excessive model precipitation. They include the under-representation of sub-grid clouds, such as shallow convective clouds, the under-representation of aerosol-radiation interactions, and the missing aerosol-cloud interactions. These speculations are tested by sensitivity simulations performed with a newly developed RH-based cloud fraction parameterization and with an aerosol-aware microphysics scheme that accounts for both aerosol-radiation and aerosol-cloud interactions. However, we want to emphasize that our analysis of these model physics is quite preliminary, and an in-depth examination is much needed to fully understand the role of sub-grid clouds and aerosols in modulating the warm-season precipitation in the East Asian monsoon climate.