Simulation of summer precipitation diurnal cycles over the Tibetan Plateau at the gray-zone grid spacing for cumulus parameterization

The Tibetan Plateau (TP) is often referred to as the “water tower of Asia” or the “Third Pole”. It remains a challenge for most global and regional models to realistically simulate precipitation, especially its diurnal cycles, over the TP. This study focuses on evaluating the summer (June–August) precipitation diurnal cycles over the TP simulated by the Weather Research and Forecasting (WRF) model. The horizontal resolution used in this study is 9 km, which is within the gray-zone grid spacing at which a cumulus parameterization scheme (CU) may or may not be used. We conducted WRF simulations with different cumulus schemes (CU experiments) and a simulation without CU (No_CU experiment). The selected CUs include the Grell-3D Ensemble (Grell), New Simplified Arakawa-Schubert (NSAS), and Multiscale Kain-Fritsch (MSKF). These simulations are compared with both the in-situ observations and satellite products. Results show that the scale-aware MSKF outperforms the other CUs in simulating precipitation in terms of both the mean intensity and diurnal cycles. In addition, the peak time of precipitation intensity is better captured by all the CU experiments than by the No_CU experiment. However, all the CU experiments tend to overestimate the mean precipitation and simulate an earlier peak of precipitation frequency when compared to observations. The frequencies and initiation timings for short-duration (1–3 h) and long-duration (> 6 h) precipitation events are well captured by the No_CU experiment, while these features are poorly reproduced by the CU experiments. The results demonstrate that the simulation without a CU outperforms those with a CU at the gray-zone spatial resolution in regard to the precipitation diurnal cycles.


Introduction
As a fundamental cycle in the Earth's climate system, diurnal cycles of precipitation considerably affect surface radiation, temperature, and in particular the surface hydrology (Dai et al. 1999a, b). The diurnal cycle of precipitation is driven by solar forcing and affected by complicated interactions between the atmosphere and surface processes, where convection plays an important role (Jeong et al. 2011;Mooney et al. 2017;Zhang et al. 2017b;Zhou et al. 2008).
Due to the small size of convective clouds, only very fine resolution (less than a few kilometers) models can explicitly resolve the convective-scale processes. Most global and regional climate models, however, rely on cumulus schemes (CUs) to parameterize convective activities. Several CUs for weather/climate models have been developed, which have different assumptions on cloud feedbacks such as the formation of cirrus clouds (e.g. Arakawa 2004; McFarlane 2011; Sun and Bi 2019). Spatial characteristics are usually linked with horizontal resolution in numerical models; as a result, model simulations of the diurnal cycles of precipitation are significantly influenced by the horizontal resolution and individual CU (Mooney et al. 2016; Sugimoto and Takahashi 2016; Walther et al. 2013; Yuan et al. 2013; Zhang and Chen 2016). Therefore, the diurnal cycle of precipitation provides an excellent perspective to evaluate model physics in weather forecast and climate modeling with different horizontal resolutions (e.g. Dai and Trenberth 2004).
Fuqing Zhang: Deceased 19 July 2019.
The Tibetan Plateau (TP), also called the Third Pole, is the highest and most extensive upland region in the world. It is also considered the "water tower of Asia". Hence, more accurate forecasting of precipitation over the TP is of critical significance for managing water resources in Asia. A clear diurnal cycle of summer precipitation over the TP, with a peak between late afternoon and midnight, has been reported (Chen et al. 2012; Guo et al. 2014; Li 2018) but has not yet been well reproduced by most global climate models (e.g. Yuan et al. 2013; Zhang and Chen 2016) or by most regional model (RM) simulations driven by global reanalyses (e.g. Li et al. 2018; Xu et al. 2012). This is mainly reflected in an overestimated precipitation amount (e.g. Gao et al. 2015; Su et al. 2013) and an incorrect precipitation peak hour (e.g. Chow and Chan 2009; Xu et al. 2012; Yang et al. 2018; Yuan et al. 2013). The coarse grid spacing is considered one of the major reasons for the wet bias in most climate models and reanalyses over the TP (e.g. Lin et al. 2018), and limits the ability of climate models to simulate turbulent orographic form drag over the TP's complex terrain. A high-resolution RM (HiRM: horizontal resolution 1-10 km), on the other hand, has shown its added value by reducing the wet bias over the TP where the topography is complex (e.g. Lin et al. 2018). However, it remains a challenge for most RMs to capture the diurnal cycles of precipitation and associated complex multi-scale interactions over the TP, especially during summer (e.g. Sugimoto and Takahashi 2016; Yang et al. 2018).
The realistic simulation of summer precipitation over the TP, especially its diurnal cycles, remains an unresolved issue in most regional and global models (e.g. Sugimoto and Takahashi 2016). Simulations of the diurnal cycles of precipitation can be dramatically influenced by the CU employed in HiRMs (e.g. He et al. 2015;Sugimoto and Takahashi 2016). Prein et al. (2015) have shown that the physical justification for the application of a CU starts to break down when grid spacing becomes smaller than approximately 10 km. The grid spacing between 10 and 4 km is the so-called gray-zone in which individual convective cells cannot be resolved but the organized mesoscale convective systems can be resolved. The gray-zone resolution is currently the highest model resolution most climate modeling groups can afford to achieve for RMs (Chen et al. 2018a;Prein et al. 2015;Wang et al. 2015). Many previous studies have demonstrated that the use of CUs may introduce systematic biases in simulating precipitation climatology over South Asia and the TP at the gray-zone resolution, while simulations without a CU (No_CU) at the same grid spacing yield a reasonable representation of summer mean precipitation (e.g. Chen et al. 2018b;Mukhopadhyay et al. 2010;Sugimoto and Takahashi 2016). The sensitivity of the simulated diurnal cycles of summer precipitation over the TP to different CUs at the gray-zone grid spacing has not yet been systematically evaluated. It is not clear whether No_CU can accurately capture the characteristics of the summer diurnal cycle of precipitation over the TP at the gray-zone grid spacing or not. In addition, CUs tend to simulate a higher precipitation frequency than that of the convection-permitting experiments without a CU (e.g. Sugimoto and Takahashi 2016). 
This may be because the convective adjustment time scale (also known as convective relaxation time scale), one of the many key parameters in convective parameterization schemes, is set as a constant value for most available CUs (Mishra and Srinivasan 2010; Zheng et al. 2016). The adjustment time scale is the time over which convective available potential energy (CAPE) is reduced to stabilize the atmosphere (Fritsch and Chappell 1980). The ratio of parameterized subgrid-scale precipitation to total precipitation decreases with an increase in the adjustment time scale (Done et al. 2006). The impacts of parameterized convection on total precipitation should become less significant when moving from coarser (~ 15 km) to high resolution (~ 1 km). Hence, the adjustment time scale should increase with grid resolution such that atmospheric stability restoration is gradually taken over by the resolved convective processes (Zheng et al. 2016). To this end, a horizontal grid spacing dependent dynamic convective adjustment time scale is adopted in the Multiscale Kain-Fritsch Scheme (MSKF) (Zheng et al. 2016). An interesting question to ask is whether a scale-aware CU, such as the MSKF, is better than non-scale-aware CUs or simply No_CU when simulating the diurnal cycles of summer precipitation over the TP at the gray-zone grid spacing.
The aim of the present work is to investigate the impact of CUs on the simulations of summer precipitation diurnal cycles over the TP at the gray-zone grid spacing. We use the following CU options: No_CU, MSKF, the Grell 3D Ensemble Scheme (Grell) (Grell and Dévényi 2002), and the New Simplified Arakawa-Schubert Scheme (NSAS) (Han and Pan 2011). Among them, Grell and NSAS are non-scale-aware but have shown good performance in simulating summer precipitation over Asia (e.g. Ganai et al. 2016;Yang et al. 2019).

Model configuration
The Weather Research and Forecasting (WRF, version 3.7.1) model in the non-hydrostatic configuration is used to dynamically downscale global reanalysis with a focus on the TP. The New Goddard short-wave radiation scheme (Chou and Suarez 1999), RRTMG Long-wave radiation scheme (Iacono et al. 2008), WRF Double Moment 6-class Microphysics Parameterization (Lim and Hong 2010), Yonsei University (YSU) PBL scheme (Hong et al. 2006), and Unified Noah Land Surface Model (Tewari et al. 2004) are used. Since June 2018, the Integrated Forecasting System (IFS) of the European Centre for Medium-Range Weather Forecasts (ECMWF) has provided a high-resolution forecast (HRES) at 9 km grid spacing. Our study follows ECMWF's HRES and the methods proposed by Wang et al. (2015) and Chen et al. (2018a, b, c), who show that regional models with a 9 km resolution are able to realistically capture the statistical characteristics of precipitation over the tropics and extratropics.
The model domain (Fig. 1) is configured at 9 km grid spacing and centered at 31.5°N, 87.5°E, with 420 grid points in the east-west direction and 250 grid points in the north-south direction. There are 60 eta levels, with the model top at 10 hPa.
The choice of global reanalysis input may also affect the RM when simulating the diurnal cycles of precipitation over the TP. As Bao and Zhang (2013) have noted, the ERA-Interim (ERAI) (Dee et al. 2011) is the global reanalysis that shows close agreement with independent radiosonde data over the TP. The ERAI has been used in many RM experiments across the TP. However, simulated precipitation driven by the ERAI shows a clear wet bias over the TP in most RMs (e.g. Huang and Gao 2018), because the ERAI has a wet bias in precipitable water. In July 2017, ECMWF began publishing their fifth-generation global reanalysis product, ERA5. Compared to the ERAI, the ERA5 has assimilated more observations (e.g. satellite-derived radiance over cloudy areas) (Hersbach et al. 2018). The ERA5 also has a higher spatial-temporal resolution (31 km and hourly interval) compared to the ERAI (79 km and six-hourly interval). In this study, the ERA5 provides the initial and boundary conditions for all the HiRM experiments over the TP. We use three-hourly data from the ERA5. In addition to No_CU, three CUs, namely MSKF, Grell, and NSAS, are tested to assess the impact of CUs on the simulated diurnal cycles of summer precipitation over the TP.
Spectral nudging (Waldron et al. 1996), which can be regarded as an indirect data assimilation method (von Storch et al. 2000), is used in this work. The spectral nudging consists of adding a new term to the tendencies of the model variables Q (to be nudged) that relaxes the selected part of the spectrum to the corresponding waves from driving fields Q0 (the ERA5 in this work) (Miguez-Macho et al. 2004). The new term to be added is a function of the difference fields (Q - Q0), in which the spectral decomposition (based on Fourier expansion) is performed on the difference fields. The difference fields are quasi-periodic, since they are always close to zero along the boundaries. The relaxation term, with only the coefficients for the selected part of the spectrum, is transformed from wave space to physical space and added to the tendency of Q. Only the same part of the spectrum of variable Q will be affected by the relaxation since the functions of the Fourier expansion are orthogonal. The spectral nudging can reduce the impact of model domain size on the regional simulation, thus preventing the simulation from drifting away from large-scale driving fields (Hong and Kanamitsu 2014).
Fig. 1 Map of the WRF domain with terrain elevations shaded at 500 m intervals, showing the Tibetan Plateau (TP) and locations of the surface meteorological stations (shown as triangles: red hollow triangles show those stations with elevations higher than mean elevation of the surrounding 9 WRF grids, i.e. the local 27 × 27 km region around the station; blue solid triangles show those stations with elevations higher than mean elevation of the 1 WRF grid in which the station is located, i.e. the local 9 × 9 km region around the station; remaining stations are shown as blue hollow triangles). See Gao et al. (2015) for more detailed information on the stations. The black line is the 2000 m contour
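The relaxation described above can be illustrated with a deliberately simplified sketch. It assumes a single periodic one-dimensional row of grid points and a plain discrete Fourier transform, whereas the model applies the nudging in two horizontal directions and only above the PBL; the function names `dft` and `nudging_tendency` are hypothetical and are not part of WRF.

```python
import cmath
import math

def dft(values):
    """Plain discrete Fourier transform (O(n^2)), adequate for a short illustrative row."""
    n = len(values)
    return [sum(values[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)) / n
            for k in range(n)]

def nudging_tendency(q, q0, max_wavenumber, coeff):
    """Return -coeff * (Q - Q0), restricted to wavenumbers <= max_wavenumber."""
    n = len(q)
    diff = [a - b for a, b in zip(q, q0)]
    spectrum = dft(diff)
    # Keep wavenumber k and its conjugate partner n - k; zero the shorter waves.
    kept = [c if min(k, n - k) <= max_wavenumber else 0.0
            for k, c in enumerate(spectrum)]
    # Transform the retained coefficients back from wave space to physical space.
    filtered = [sum(kept[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)).real
                for j in range(n)]
    return [-coeff * f for f in filtered]

# Illustrative data: the model departs from the driving field by one long wave
# (wavenumber 1) plus one short wave (wavenumber 5).
n = 16
long_wave = [2.0 * math.cos(2 * math.pi * j / n) for j in range(n)]
short_wave = [math.sin(2 * math.pi * 5 * j / n) for j in range(n)]
q0 = [0.0] * n
q = [l + s for l, s in zip(long_wave, short_wave)]
tend = nudging_tendency(q, q0, max_wavenumber=2, coeff=0.0003)
```

In this toy case the tendency acts only on the long wave, so the short wave (standing in for the scales the regional model is meant to develop on its own) is left untouched.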
This work follows a similar approach over North America by Liu et al. (2017a). Specific details of the spectral nudging are as follows: (1) nudged variables comprise geopotential, horizontal wind and temperature; (2) a common nudging coefficient (0.0003 s⁻¹) is used for all variables to adjust the strength of the nudging force in the governing equations (e.g. Otte et al. 2012); (3) the nudging is applied to levels above the approximate PBL top, with a magnitude increasing linearly to the full amount at the 5th level above the PBL; and (4) the wavenumber truncations are 2 and 1, corresponding to cut-off wavelengths of about 1890 and 2250 km (above which the spectral nudging is applied), in the zonal and meridional directions, respectively.
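The quoted cut-off wavelengths follow from the domain dimensions and the truncation wavenumbers. The short check below assumes the domain length is taken as the number of grid points times the grid spacing, which reproduces the stated values; the function name is illustrative.

```python
# Domain size taken from the model configuration: 420 x 250 points at 9 km.
DX_KM = 9.0
NX, NY = 420, 250

def cutoff_wavelength_km(n_points, dx_km, wavenumber):
    """Shortest nudged wavelength: domain length divided by the truncation wavenumber."""
    return n_points * dx_km / wavenumber

zonal = cutoff_wavelength_km(NX, DX_KM, 2)       # 3780 km / 2
meridional = cutoff_wavelength_km(NY, DX_KM, 1)  # 2250 km / 1
print(zonal, meridional)  # 1890.0 2250.0
```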
The WRF model experiments are run continuously from April 25th to September 1st for the year 2014 but only the output of the boreal summer monsoon season (June-August) is analyzed.

Data sets
Model simulations are evaluated by comparing with hourly precipitation at 83 surface meteorological stations over the TP during summer 2014, provided by the China Meteorological Administration (CMA) (Fig. 1). Details of the 83 meteorological stations are provided by Gao et al. (2015). However, the density of in-situ stations is sparse over the TP, with most available stations located in valleys (e.g. Li 2018). To better understand the observed diurnal cycles over the entire TP, satellite precipitation products are also used, including the bias-corrected half-hourly precipitation from the Climate Prediction Center morphing method (CMORPH) (Joyce et al. 2004; Xie et al. 2017) with a horizontal resolution of 8 km and half-hourly precipitation from the Global Precipitation Measurement (GPM) (Hou et al. 2014; Liu et al. 2017b) version V05B with a horizontal resolution of 0.1° (~ 10 km). The half-hourly precipitation is aggregated to hourly precipitation and then interpolated onto the WRF grids using nearest-neighbor interpolation. Satellite precipitation products such as GPM and CMORPH, based on passive microwave and IR sensors, tend to have difficulty in detecting shallow orographic precipitation and light rain events (e.g. Sohn et al. 2010; Wei et al. 2018). This will affect the reliability of satellite precipitation products over the TP. The 83 grids collocated with the stations for CMORPH and GPM are utilized when comparing between the satellite and in-situ observations. The GPM precipitation product started in the year 2014 and the hourly gauge observation is also available for the year 2014; the availability of both datasets is the reason for choosing the year 2014 to be investigated in this study.
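The two preprocessing steps (aggregation to hourly values and nearest-neighbor mapping) can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: it assumes the half-hourly values are accumulations in mm (so that pairs are summed), it measures distance directly in degrees of latitude and longitude, and the function names and sample coordinates are hypothetical.

```python
def to_hourly(half_hourly_mm):
    """Sum consecutive pairs of half-hourly accumulations (mm) into hourly totals."""
    if len(half_hourly_mm) % 2:
        raise ValueError("expected an even number of half-hourly values")
    return [half_hourly_mm[i] + half_hourly_mm[i + 1]
            for i in range(0, len(half_hourly_mm), 2)]

def nearest_index(target, sources):
    """Index of the source (lat, lon) point closest to the target point."""
    lat, lon = target
    return min(range(len(sources)),
               key=lambda i: (sources[i][0] - lat) ** 2 + (sources[i][1] - lon) ** 2)

hourly = to_hourly([0.0, 0.25, 0.5, 0.5])        # [0.25, 1.0]
sat_points = [(31.0, 87.0), (31.0, 87.1), (31.1, 87.0)]  # made-up satellite pixels
idx = nearest_index((31.08, 87.01), sat_points)  # 2
```

If a product instead reports half-hourly rates in mm per hour, the pair would be averaged rather than summed.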

Diurnal cycles
The simulated hourly precipitation amount, frequency, and intensity are systematically compared to observations. The hourly precipitation amount is defined as the seasonal mean of accumulated precipitation amount during each hour in the summer season (June-August). The precipitation frequency for a given hour of day is defined as the percentage of days on which there is precipitation (≥ 0.1 mm h⁻¹, which is the minimum amount of measurable precipitation for the in-situ observations) during that given hour of day. The precipitation intensity is defined as the precipitation amount for a given hour of day, averaged over all days during June-August when there is precipitation (≥ 0.1 mm h⁻¹) during that given hour of day.
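The three definitions can be made concrete with a small sketch. It assumes the input is one list of 24 hourly totals per day, that the amount averages all hourly values (including those below the threshold), and that intensity is reported as zero for an hour of day that is never wet; the function name and the sample values are illustrative only.

```python
THRESHOLD = 0.1  # mm per hour, the minimum measurable precipitation

def diurnal_statistics(hourly_mm):
    """hourly_mm: one list of 24 hourly totals (mm) per day.

    Returns three 24-element lists: amount (mm), frequency (%), intensity (mm/h)."""
    n_days = len(hourly_mm)
    amount, frequency, intensity = [], [], []
    for hour in range(24):
        values = [day[hour] for day in hourly_mm]
        wet = [v for v in values if v >= THRESHOLD]
        amount.append(sum(values) / n_days)          # mean over all days
        frequency.append(100.0 * len(wet) / n_days)  # share of wet days
        intensity.append(sum(wet) / len(wet) if wet else 0.0)  # mean over wet days
    return amount, frequency, intensity

# Four made-up days that differ only at 15:00; 0.05 mm is below the threshold.
days = []
for value in [0.0, 2.0, 0.05, 1.0]:
    day = [0.0] * 24
    day[15] = value
    days.append(day)
amount, frequency, intensity = diurnal_statistics(days)
print(frequency[15], intensity[15])  # 50.0 1.5
```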
The peak precipitation time (phase) of precipitation amount, frequency, and intensity is calculated using the harmonic analysis (Wilks 2006). The diurnal variation of hourly averaged precipitation $P$ (averaged for the whole summer, as is the case for precipitation amount, frequency, and intensity) at hour $h$ is represented by the summation of sinusoidal harmonics as

$$P(h) = \bar{P} + \sum_{k} C_k \cos\left(\frac{2\pi k h}{24} - \phi_k\right),$$

where $\bar{P}$ is the 24-h mean of $P$, $k$ is the harmonic number (i.e., 1 for the 24-h cycle, 2 for the 12-h cycle, etc.), and $C_k$ and $\phi_k$ are the amplitude and phase of the $k$-th harmonic. For more detailed information on this procedure, please refer to Wilks (2006) and Jeong et al. (2011). The sum of the first two harmonics ($k = 1$ and $k = 2$) is defined as the diurnal cycle of $P$ and the peak time is the time corresponding to the maximum value of $P$.
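A minimal sketch of this procedure is given below. It assumes an evenly spaced 24-value series, estimates each harmonic from the usual cosine and sine projections, and locates the peak by evaluating the fitted curve on a 0.1-h grid; the function names and the grid step are choices made for this illustration.

```python
import math

def harmonic(series, k):
    """Amplitude C_k and phase phi_k of the k-th harmonic of an evenly spaced series."""
    n = len(series)
    a = 2.0 / n * sum(p * math.cos(2 * math.pi * k * h / n) for h, p in enumerate(series))
    b = 2.0 / n * sum(p * math.sin(2 * math.pi * k * h / n) for h, p in enumerate(series))
    # a*cos(x) + b*sin(x) = C*cos(x - phi) with C = hypot(a, b), phi = atan2(b, a)
    return math.hypot(a, b), math.atan2(b, a)

def diurnal_peak_hour(series, steps_per_hour=10):
    """Hour at which the mean plus the first two harmonics is largest."""
    n = len(series)
    mean = sum(series) / n
    parts = [(k, harmonic(series, k)) for k in (1, 2)]

    def fitted(h):
        return mean + sum(c * math.cos(2 * math.pi * k * h / n - phi)
                          for k, (c, phi) in parts)

    hours = [i / steps_per_hour for i in range(n * steps_per_hour)]
    return max(hours, key=fitted)

# Synthetic series with a single maximum at 18:00.
series = [1.0 + math.cos(2 * math.pi * (h - 18) / 24) for h in range(24)]
peak = diurnal_peak_hour(series)
```

For the synthetic series the recovered peak falls at hour 18, as constructed.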

Precipitation events
Following Li (2018) and Yu et al. (2007), precipitation events are classified according to their duration with measurable precipitation (≥ 0.1 mm h⁻¹). Each precipitation event can have a maximum of one 1-h gap (i.e. where hourly precipitation is less than 0.1 mm). The duration of the precipitation event is the number of hours from the beginning to the end of the event. 1-3 h is classified as short-duration; 4-6 h as medium-duration; and 7 h or longer as long-duration.
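The event definition can be sketched as a single pass over an hourly series. The sketch reads "a maximum of one 1-h gap" as at most one isolated dry hour per event, which is an interpretation rather than a detail confirmed by the text; the function names and the sample series are illustrative.

```python
THRESHOLD = 0.1  # mm per hour

def find_events(hourly_mm):
    """Return (start_index, duration_hours) for each precipitation event."""
    events = []
    start = None      # index of the first wet hour of the open event
    last_wet = None   # index of the most recent wet hour
    gap_used = False  # whether the open event already contains its one dry hour
    for i, value in enumerate(hourly_mm):
        wet = value >= THRESHOLD
        if start is None:
            if wet:
                start, last_wet, gap_used = i, i, False
        elif wet:
            if i - last_wet == 2:
                gap_used = True  # this wet hour bridges a single dry hour
            last_wet = i
        elif gap_used or i - last_wet >= 2:
            # A second gap, or two dry hours in a row, closes the event.
            events.append((start, last_wet - start + 1))
            start = None
    if start is not None:
        events.append((start, last_wet - start + 1))
    return events

def classify(duration_hours):
    """Duration classes: 1-3 h short, 4-6 h medium, 7 h or longer long."""
    if duration_hours <= 3:
        return "short"
    if duration_hours <= 6:
        return "medium"
    return "long"

sample = [0, 0.5, 0, 0.3, 0, 0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2, 0.2, 0]
events = find_events(sample)                  # [(1, 3), (6, 7)]
labels = [classify(d) for _, d in events]     # ['short', 'long']
```

The first event spans three hours because its single dry hour counts toward the duration, which is measured from the first to the last wet hour.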

Simulated summer mean precipitation
In this section, summer mean precipitation from the ERA5 and all four WRF experiments is evaluated against satellite and gauge observations. The spatial distribution of summer mean precipitation is shown in Fig. 2. Generally, the two satellite precipitation estimates agree with each other quite well, except for the central TP where GPM is 1-3 mm/day wetter than CMORPH on average (Fig. 2b). Differences larger than 7 mm/day can also be found over the central TP. This may be due to the higher uncertainty of the precipitation retrievals over the in-land water bodies (i.e., lakes over the TP) by the CMORPH since they match closely with lakes there. The mean of CMORPH and GPM averaged over the whole TP agrees well with the gauge observation (Fig. 3) and is therefore adopted as a reference for further evaluations of the model simulations with and without a CU, while the differences between satellite and gauge data are taken to indicate the uncertainty in the satellite-based reference.
[Figure caption fragment: … (Grell), and WRF (NSAS)). The difference between GPM and CMORPH is also shown in (b)]
Compared to the mean of the satellite data (Fig. 2a), the ERA5 agrees quite well with the satellite observations in terms of the precipitation spatial distribution, with small root mean square error (RMSE; 3.13 mm against the mean of two satellite precipitation estimates) (Fig. 2c). However, the ERA5 systematically overestimates precipitation across the whole TP. Overall, the spatial pattern of summer mean precipitation is well reproduced by the WRF HiRMs (Fig. 2d-g), especially by the No_CU, MSKF, and NSAS experiments (Fig. 2d-g). This may be a benefit from improved representation of the precipitation processes related to local topography when using the finer horizontal resolution (e.g. Sugimoto and Takahashi 2016; Walther et al. 2013). The results from the NSAS experiment show the lowest RMSE (2.94 mm against the mean of two satellite precipitation estimates) compared to the satellite products (Fig. 2g). This is mostly reflected in the well captured mean precipitation over the southern TP by the NSAS experiment. The spatial distribution of simulated summer precipitation is better captured by the No_CU experiment than by the MSKF and Grell experiments (Fig. 2d-f). When compared to both the in-situ and satellite observations over 83 stations/grids (collocated with stations), the No_CU experiment has the lowest RMSE (2.43 mm) against the in-situ observation among all the WRF HiRMs, as shown in the Taylor (2001) diagram (Fig. 3a). This indicates that both the No_CU and NSAS have their advantages in simulating the spatial distribution of summer mean precipitation.
When averaged over the whole TP (Fig. 3b), summer mean precipitation is well reproduced by the No_CU. All the HiRM simulations (with different CUs, including the scale-aware CU MSKF) tend to overestimate summer precipitation when compared to the observations. This pattern is consistent with previous works (e.g. Chen et al. 2018b;Mukhopadhyay et al. 2010;Sugimoto and Takahashi 2016), which demonstrates the advantage of the No_CU experiment in simulating the summer mean precipitation over the TP, compared to the CU experiments.

Simulated diurnal cycles
Observed and simulated peak precipitation times in summer are compared in Figs. 4, 5, 6. Figure 4 presents the hour of day when most precipitation falls; Fig. 5 shows the hour of day when precipitation is most frequent; Fig. 6 displays the hour of day when precipitation is most intense. Generally, the spatial patterns of the diurnal peak time of precipitation from the two satellite products (CMORPH and GPM) agree relatively well with each other and with the ground observations (Figs. 4a-c, 5a-c, 6a-c). However, in some regions the difference in diurnal peak time between CMORPH and GPM can be as long as 12 h, especially over the northern dry region where satellite products are less reliable (e.g. Wei et al. 2018). The disagreement may be due to the different satellites and algorithms used in deriving the products (Hou et al. 2014; Joyce et al. 2004). Figure 5 shows the spatial distributions of peak hour of precipitation frequency. The main discrepancies between the gauge and satellite observations are over the Qaidam Basin (northeastern TP), where the in-situ observations show a frequency peak in the morning while satellite observations show peaks at late night (Fig. 5a, b). This may be due to the inability of satellite products to detect light rain events (e.g. Sohn et al. 2010) over the dry Qaidam Basin. A larger difference between the two satellite products is found in the peak time of precipitation intensity (Fig. 6c) than in precipitation amount or frequency (Figs. 4c, 5c).
The averaged peak times of precipitation amount and frequency recorded by the rain gauges over the whole TP are later than those shown by the satellite observations (Fig. 7). Similar results have also been found by using different satellite precipitation products over the TP (Chen et al. 2012;Yang et al. 2018;Zhou et al. 2008), which may be linked to the impacts of topography and landscape on the precipitation diurnal cycles. The diurnal cycles of precipitation tend to peak in the daytime (evening-to-nocturnal) for stations on mountains (in valleys) (Chen et al. 2012;Guo et al. 2014). However, satellite observations face difficulties in representing the precipitation gradient in the direction normal to the orography (Derin and Yilmaz 2014). The diurnal cycles of satellite-derived precipitation tend to resemble the diurnal cycles observed by mountain gauge stations (e.g. Chen et al. 2012). These lead to an earlier diurnal peak in precipitation amount and frequency in the satellite data when compared to the gauge observations, for most of the gauge stations are located in valleys (e.g. Li 2018). The diurnal peak of precipitation is generally well captured by the ERA5 (Figs. 4d,  5d, 6d). However, the ERA5 tends to show an earlier peak of precipitation, especially for precipitation amount and frequency.
Simulated diurnal cycles in the HiRMs with three different CUs and with No_CU are compared in Figs. 4, 5, 6 and 7. All the HiRM experiments tend to simulate earlier peaks in precipitation amount and frequency in the south of the TP, when compared with the satellite observations (Figs. 4,5). This is especially true for the Grell and NSAS experiments. The peak of precipitation amount and frequency is well represented by the MSKF experiment and the No_CU experiment, with similar RMSEs. However, the peak hour of precipitation amount simulated by the MSKF experiment is about 3-6 h later than that of the satellite observations over a large part of the TP (Fig. 4e, f). A systematic later (early) peak in precipitation frequency simulated by the MSKF experiment is seen in the west (southeast) of the TP, while a generally later peak is found in the No_CU experiment than in the satellite observations (Fig. 5e, f). Interestingly, the simulated peak time of precipitation frequency over the Qaidam Basin in all the HiRM experiments agrees better with the gauge observations than with the satellite observations. The HiRM experiments and the gauge observations show morning peaks, while satellite observations give midnight peaks (Fig. 5). A similar pattern is also found in the diurnal peak time of precipitation amount (Fig. 4). This confirms the inability of the satellite precipitation data to capture precipitation dynamics over the dry region (e.g. Wei et al. 2018). At the same time, it shows the ability of the HiRM to simulate precipitation over the TP. The No_CU experiment tends to simulate an earlier peak of precipitation intensity while the CU experiments generally simulate a later peak, when compared with the satellite observations (Fig. 6). The spatial distribution of the peak hour of precipitation intensity in all the CU experiments, especially the MSKF experiment, agrees better with the satellite observations when compared to that of the No_CU experiment.
When averaged over all the 83 stations or station-collocated grids, the diurnal cycles of precipitation amount and frequency in the two non-scale-aware CU experiments are a few hours earlier than those in the satellite observations (Fig. 7). The average diurnal peak of precipitation amount and frequency from the ERA5 is similar to that of the two non-scale-aware CUs. The diurnal cycles of precipitation amount in the MSKF and No_CU experiments are similar to the satellite observations (Fig. 7a); this may explain why both the No_CU and MSKF experiments can better reproduce the mean precipitation over the TP than the two non-scale-aware CUs (Fig. 3). The No_CU experiment agrees better with the satellite observations in terms of the peak hour of precipitation frequency than that of the MSKF experiment. The averaged peak hour of precipitation frequency in the MSKF experiment is about 3 h earlier than that of the satellite observations (Fig. 7b). The averaged peak hour of precipitation intensity is better captured by the CU experiments than by the No_CU experiment when compared against both the satellite and gauge observations (Fig. 7c), indicating that the CU configuration is important for the simulation of extreme precipitation over the TP (e.g. Han et al. 2016).
There are some differences in the peak hour of precipitation amount and frequency between the satellite products and gauge observations; in particular, the satellite products tend to show a double peak (at 16:00 and 22:00 local standard time (hereafter LST)) while the gauge observations show a single peak (at 23:00 LST). A similar feature has also been found in the Eastern TP by Zhou et al. (2008), in which a single peak is shown in gauge observations while a double peak can be found in the hourly Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) satellite product.
The simulation of diurnal cycles over the TP may be further improved when using finer grid spacing, such as that in convection-permitting modeling (CPM) with a horizontal resolution smaller than 4 km, as shown by Sato et al. (2008). However, as noted by Sugimoto and Takahashi (2016), the diurnal characteristics of precipitation are more difficult to be simulated during the mature monsoon season (July and August) than during the pre-monsoon season (April and May). Some CPMs poorly reproduce the diurnal peak hour of precipitation amount; for example, Xu et al. (2012) shows that a 3-km-resolution CPM simulated a later peak of precipitation amount than that of a satellite product over the central TP. Further studies are required to assess the added value of CPMs in simulating diurnal cycles of precipitation over the TP.
Generally speaking, the mean total precipitation, as well as the diurnal cycles of precipitation amount and frequency, is better captured by the No_CU experiment than the three CU experiments. However, the diurnal cycles of precipitation intensity are better reproduced by the CU experiments, especially by the MSKF experiment. The performance of the scale-aware CU (MSKF) experiment is better than those of the two non-scale-aware CU experiments in terms of the mean amount and diurnal cycle of precipitation.

Simulated precipitation events
To further assess the capability of WRF in simulating the diurnal cycle of precipitation over the TP, simulated precipitation events are compared with the satellite and gauge observations in Figs. 8 and 9. The majority of precipitation events over the TP are short-duration (1-3 h), as seen from both the satellite and gauge observations. This feature is clearly captured by No_CU (Fig. 8a) but underestimated by the ERA5 and all the HiRMs with a CU. Most of the total precipitation comes from long-duration precipitation events, as shown by both the satellite and gauge observations (Fig. 8b); however, all the CU experiments and the ERA5 yield higher contributions of long-duration precipitation events when compared to both the satellite and gauge observations, leading to the overestimates of precipitation associated with long-duration events.
The most common start hour of the short-duration precipitation events is in the afternoon, according to both the satellite and gauge observations (Fig. 9a). Interestingly, the frequency and peak time of short-duration precipitation events are well reproduced by all the HiRM experiments (Fig. 9a). As noted by Yu et al. (2007), the diurnal cycles of short-duration precipitation events are closely linked to the diurnal cycles of surface solar heating. The linkages here indicate that surface solar heating is well represented by all the HiRM experiments.
Fig. 9 The diurnal distribution of start-hour frequency for a short-(1-3 h), b middle-(4-6 h), and c long-(> 6 h) duration precipitation events. The plots are based on the in-situ observation, GPM/CMORPH, the global reanalysis ERA5, and the three WRF simulations driven by ERA5 with different CUs (Grell, MSKF, and NSAS) as well as the one without CU (averaged over the 83 stations or station-collocated grids)
The spatial distribution of the peak hour of precipitation amount for short-duration precipitation events is relatively well captured by all the HiRM experiments, especially the No_CU experiment (Fig. 10). The afternoon peak of the short-duration precipitation is also partly produced by the ERA5 over the northern TP, but is about 6 h later than the satellite observations over the rest of the TP.
In the satellite estimated precipitation, the most common start hour of both middle-duration (4-6 h) and long-duration (longer than 6 h) precipitation events is in the afternoon. This feature is clearly reproduced by the No_CU experiment (Fig. 9b, c). However, the most common start hour of middle-duration and long-duration precipitation events in the CU experiments is earlier than both the observations and the No_CU experiments. This contributes to the earlier peak time of precipitation frequency simulated by the CU experiments (Fig. 6).
The frequency of both middle-duration and long-duration precipitation events is generally overestimated by the CU experiments and the ERA5, but relatively well reproduced in the No_CU experiment (Figs. 8,9). Further, the spatial distribution of the peak hour of precipitation amount in the long-duration precipitation events is relatively well captured by both the No_CU and MSKF experiments (Fig. 11), with an earlier peak in precipitation amount simulated by the two non-scale-aware CU experiments and the ERA5.
All the CU experiments significantly overestimated the duration of light precipitation (hourly precipitation intensity between 0.1 and 1 mm; Fig. 12), leading to an overestimated frequency of the long-duration precipitation events shown in Fig. 9. The overestimated durations may be caused by the early onset of convective precipitation, resulting in an excessive release of convective instability in the CU experiments (e.g. Betts and Jakob 2002; Sugimoto and Takahashi 2016). Atmospheric models tend to release convective instability too readily, so that convection is triggered too frequently in simulations with a CU (Ma et al. 2013; Trenberth et al. 2017). This can result in an overestimated frequency of low- to intermediate-intensity precipitation events. Li (2018) demonstrated that the midnight peak of precipitation over the TP is mainly attributed to long-duration precipitation events. In our case, the spatial pattern of the peak time of long-duration precipitation events is well captured by the No_CU and MSKF experiments. An earlier peak time is simulated in the non-scale-aware CU experiments, resulting in the earlier (by about 3 h on average as seen from Fig. 7) peak of precipitation amount in these two experiments (Fig. 4).
… Fig. 4, but for total precipitation by short-duration (1-3 h) precipitation events (the precipitation amount is the sum of all precipitation belonging to the short-duration precipitation events)
Individual CUs have different influences on the vertical transport of sensible heat (SH) and the formation of clouds (e.g. Arakawa 2004; McFarlane 2011; Sun and Bi 2019), which in turn affect the near-surface SH. The near-surface SH and related upward motion may further influence the model's ability to realistically simulate precipitation (e.g. Li et al. 2018). At the same time, the implementation of different CUs affects water vapor transport in models (e.g. Yu et al. 2011), which may further influence the simulated precipitation duration (Li 2018).
Fig. 12 The number of precipitation hours binned by intensity based on the in-situ observations, GPM/CMORPH, the global reanalysis ERA5, and the three WRF simulations driven by ERA5 with different CUs (Grell, MSKF, and NSAS) as well as the one without CU for all precipitation events (averaged over the 83 stations or station-collocated grids)
The No_CU experiment represents well the frequencies of both light- and long-duration precipitation events, and their contributions to total precipitation. A clear overestimation of the mean precipitation in the MSKF experiment is caused by the overestimated frequency of both light- and long-duration precipitation events. This indicates that adopting No_CU to simulate precipitation may further improve the performance of other regional models forced by the simulated precipitation, such as hydrological models for which accurately simulated precipitation events are critical.
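The intensity binning behind Fig. 12 can be sketched as a count of wet hours per intensity bin. Only the light-precipitation range (0.1 to 1 mm per hour) comes from the text; the remaining bin edges, the half-open bin convention, and the function name are assumptions made for illustration.

```python
from bisect import bisect_right

# Lower bin edges in mm/h. Only 0.1 and 1.0 are taken from the text;
# 5.0 and 10.0 are hypothetical edges chosen for this example.
BIN_EDGES = [0.1, 1.0, 5.0, 10.0]


def count_hours_by_intensity(hourly_precip, edges=BIN_EDGES):
    """Count hours per intensity bin.

    Bin i covers [edges[i], edges[i + 1]); the last bin is open-ended.
    Hours below the first edge are treated as dry and not counted.
    """
    counts = [0] * len(edges)
    for p in hourly_precip:
        i = bisect_right(edges, p) - 1
        if i >= 0:
            counts[i] += 1
    return counts


# Made-up hourly values (mm/h) for illustration only, not data from the study
sample = [0.0, 0.05, 0.1, 0.5, 1.0, 3.0, 12.0, 0.9]
print(count_hours_by_intensity(sample))  # [3, 2, 0, 1]
```

Comparing the first bin between a simulation and the observations gives a direct measure of the light-precipitation overestimate discussed for the CU experiments.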
The inter-annual variability in the diurnal cycles of summer precipitation is small; thus, the current work can be considered as representative of the general characteristics of the summer diurnal cycle of precipitation over the TP even though only one summer is analyzed.

Concluding remarks
In the current work, the simulated mean and diurnal cycles of summer precipitation using different CU options have been assessed by comparing with in-situ and satellite precipitation observations. Factors that may be important to the simulated diurnal cycles of precipitation have also been proposed and discussed.
The No_CU experiment reproduces the mean precipitation over the TP fairly realistically and outperforms the experiments with different CUs in terms of both the mean and the diurnal cycle of precipitation amount and frequency. However, the averaged peak hour of precipitation intensity is more accurately captured by the CU experiments than by the No_CU experiment.
The No_CU experiment also accurately captures the frequency of short-duration (1-3 h) and long-duration (> 6 h) precipitation events and their contributions to total precipitation, when compared with both the satellite and gauge observations. The frequency of the long-duration precipitation and its contribution to total precipitation are overestimated by all the CU experiments. The most common start hour of short-duration precipitation events is well reproduced by all the experiments, but only the No_CU experiment captures the most common start hour of long-duration precipitation events.
The scale-aware CU (MSKF) outperforms the non-scale-aware CUs (Grell and NSAS) in capturing the mean and diurnal cycles of precipitation. The MSKF may have an advantage over No_CU in simulating extreme precipitation events over the TP in the gray zone. Due to the short period of the simulation, however, this needs further assessment in future work.
Acknowledgements Open access funding provided by University of Gothenburg. Thanks to the European Centre for Medium-Range Weather Forecasts (ECMWF) for providing the ERA5 and ERAI reanalysis data, especially to Dr. Hans Hersbach from ECMWF for his help in getting the ERA5 reanalysis products (https://cds.climate.copernicus.eu/). The computations were performed on resources provided by the Swedish National Infrastructure for Computing (SNIC) at the National Supercomputer Centre in Sweden (NSC), partially funded by the Swedish Research Council through grant agreement no. 2016/5-20, 2017/12-12, 2018/2-11, 2019/2-8, 2019/2-11. This is contribution no. 1 to CORDEX-FPS-CPTP.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.