1 Introduction

Heavy persistent precipitation has a large impact on society and agriculture. It can cause flooding, crop loss, landslides and other major disruptions such as the events along the lower reaches of the Yangtze River in China in May 2016 (Li et al. 2018a) or during the winter of 2018/2019 (Hu et al. 2021). Du et al. (2019) indicated that persistent extremes were increasing in most regions globally. China is particularly exposed to heavy precipitation and has experienced intensified events during the past decades (e.g. Li et al. 2012; Zhai et al. 2005; Qian et al. 2007; Gu et al. 2017). However, Wu et al. (2019) showed that this change may not be spatially uniform, with observed decreasing trends over the northern regions. Compared to short-lived extreme precipitation, e.g. that related to tropical cyclones (e.g. Pei et al. 2018), sustained precipitation over several days is often related to the monsoon seasonal circulation (Wu et al. 2019). Eastern China is particularly vulnerable to this type of persistent precipitation, as highlighted by many recent studies (e.g. Chen and Zhai 2013; He and Zhai 2018).

Anthropogenic influence on heavy precipitation has been discussed in many previous works at a global scale (e.g. Min et al. 2011; Fowler et al. 2021) or focused on China (e.g. Burke et al. 2016; Li et al. 2018b; Sun et al. 2019). To predict future changes in such events, one must first have a clear idea of whether climate models are able to reproduce the observed signal over the historical period. If they prove to be unrealistic or inconsistent with observations, then climate projections with such models are unlikely to be trustworthy. This paper aims to answer the following questions: Can the Coupled Model Intercomparison Project Phase 6 (CMIP6) climate models, when run with historical forcings (Gillett et al. 2016), reproduce the observed persistent precipitation signal over Eastern China? If so, are they reproducing the signal for the right reasons? And what are the expected future changes in risk?

The study builds upon Tian et al. (2021) (referred to as Part I hereafter), focusing on the summer 5-day maximum precipitation (RX5d). The ability of models to reproduce the main patterns of the observed RX5d over Eastern China and the related circulation features is analysed. This provides a solid physical basis for further analysis with the models, investigating past and future changes in the risk of extreme RX5d. The impact of model horizontal resolution on the reproduction of the observed signal is also investigated. Previous studies have shown, for the Coupled Model Intercomparison Project Phase 5 (CMIP5) models (Taylor et al. 2012), that there was a complex non-systematic relationship between the models’ resolution and their ability to simulate the monsoon circulation and the associated precipitation (Wu et al. 2017). Freychet et al. (2017) indicated that even with higher resolution models, discrepancies between observed and simulated precipitation could be significant and are highly model-dependent.

Data and techniques used for this study are presented in Sect. 2. Section 3 provides a complete model evaluation analysis. In Sect. 4, recent and future changes in RX5d are investigated. Finally, discussion and concluding remarks are presented in Sect. 5.

2 Data and method

2.1 Datasets

Thirty-two individual models from CMIP6 (Eyring et al. 2016) are selected, based on the availability of daily precipitation data (Table 1) for the historical period (1850–2014). For models with several ensemble members, the analysis is computed for each member individually and then averaged to provide a single output for the model. For comparison with the models, a dataset of daily precipitation over China is used (OBS), provided by the National Meteorological Information Center (NMIC), China Meteorological Administration (CMA). This dataset was rigorously quality controlled, as described in Chen and Frauenfeld (2014), and is available over 1961–2017. For all analyses comparing models with OBS, the 1961–2014 historical period is used as it is common to all datasets.

2.2 Heavy precipitation definition (RX5d)

The analysis focuses on June–July–August (JJA). Heavy precipitation is defined as the maximum 5-day accumulated precipitation (RX5d) during the JJA season at each grid point. To allow direct comparison between datasets, all model and observed daily precipitation data are first interpolated (bilinear interpolation) to an N96 grid (about 2\(^{\circ }\) horizontal resolution, with 192 points in longitude and 145 points in latitude) before computing RX5d. Eastern China is defined as Mainland China bounded by 103\(^{\circ }\) E–135\(^{\circ }\) E and 17\(^{\circ }\) N–55\(^{\circ }\) N. As the focus is on socio-economically damaging events and for consistency with OBS, we consider only land points. For models that use a fractional land/sea grid, land points are defined as those with a fractional land area \(> 0.5\).
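The RX5d definition above can be sketched in a few lines. This is a minimal illustration only, assuming one JJA season of daily precipitation is held in a NumPy array with time as the first axis; the function name `rx5d` and the array layout are hypothetical and are not taken from the code used for the study.

```python
import numpy as np


def rx5d(daily_precip, window=5):
    """Maximum `window`-day accumulated precipitation along the time axis.

    daily_precip: array of shape (time, ...) with one season of daily
    totals (mm/day). This layout is an assumption made for illustration.
    """
    daily = np.asarray(daily_precip, dtype=float)
    if daily.shape[0] < window:
        raise ValueError("need at least `window` days of data")
    # Cumulative sum with a leading zero row, so that
    # csum[t + window] - csum[t] is the total over days t .. t+window-1.
    zero = np.zeros((1,) + daily.shape[1:])
    csum = np.concatenate([zero, np.cumsum(daily, axis=0)], axis=0)
    running = csum[window:] - csum[:-window]
    return running.max(axis=0)


# One grid point, 8 days: the wettest 5-day window is 10 + 20 + 5 + 0 + 30.
season = np.array([0.0, 1.0, 10.0, 20.0, 5.0, 0.0, 30.0, 2.0])
print(rx5d(season))  # 65.0
```

The same call works on gridded data of shape (time, lat, lon), returning one RX5d value per grid point for the season.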

Note that the RX5d computation used here does not guarantee persistent rainfall over 5 days. Indeed, RX5d could be due to a single day of exceptionally heavy rainfall. It was however verified that in the majority of cases RX5d is related to several days of rain, and that only very few cases arise from a single day (or 2 days) of rainfall. It was also verified that the models are consistent with the observed signal in this respect. Thus, both models and observations include a diversity of events in the definition of RX5d.
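A check of this kind could be done by counting the wet days inside the 5-day window that produces RX5d. The sketch below is one possible way to do so and is not the verification procedure used in the study; the function name and the 1 mm/day wet-day threshold are assumptions chosen for illustration.

```python
def wet_days_in_rx5d(daily, window=5, wet_threshold=1.0):
    """Return (rx5d, n_wet): the wettest `window`-day total and the number
    of days in that window with at least `wet_threshold` mm of rain.

    The 1 mm/day wet-day threshold is an assumption for illustration.
    """
    if len(daily) < window:
        raise ValueError("need at least `window` days of data")
    totals = [sum(daily[i:i + window]) for i in range(len(daily) - window + 1)]
    # Index of the first window that reaches the maximum total.
    start = max(range(len(totals)), key=totals.__getitem__)
    n_wet = sum(1 for value in daily[start:start + window] if value >= wet_threshold)
    return totals[start], n_wet


# A multi-day event versus a single exceptional day.
persistent = [0, 12, 25, 18, 30, 9, 0, 0]
single_day = [0, 0, 0, 95, 0, 0, 0, 0]
print(wet_days_in_rx5d(persistent))  # (94, 5)
print(wet_days_in_rx5d(single_day))  # (95, 1)
```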

2.3 RX5d modes of variability (EOTs)

As in Part I, the main modes of variability for RX5d are identified using Empirical Orthogonal Teleconnections (EOTs; Van den Dool et al. 2000). This consists of finding a base point, in the defined domain, which has the highest correlation (along the time dimension) with all other grid points of the domain. This is considered as the first EOT base point and the correlation pattern with other grid points is the first EOT pattern. Then, the linear relationship between the EOT base point and the other grid points is subtracted (to remove the first mode of variability) and the process is repeated to find a second base point explaining the most variability of the remaining signal, and so on for each successive EOT. This method finds the main modes of interannual variability along the time dimension. It indicates how much a given base point can explain the variability of all other grid points of the domain. As RX5d represents summer maxima, positive correlations indicate that RX5d covaries at interannual timescales across space. This does not mean that the RX5d maxima are reached on exactly the same day each year across the domain. Similar methods were used in Part I to evaluate the precipitation signal in observations.
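The iterative procedure described above can be sketched as follows. This is a minimal illustration rather than the implementation used in Part I: it assumes the data are arranged as a (time, space) NumPy array, and it selects each base point by the total variance it explains across the domain, which is an assumption about the exact selection criterion. The function name `eot_base_points` is hypothetical.

```python
import numpy as np


def eot_base_points(data, n_modes=2, tol=1e-12):
    """Iteratively pick EOT base points from a (time, space) array.

    Returns a list of (base_index, explained_fraction) tuples. Selecting by
    total explained variance is an assumption about the exact criterion.
    """
    resid = np.asarray(data, dtype=float)
    resid = resid - resid.mean(axis=0)            # anomalies in time
    total_var = resid.var(axis=0).sum()
    modes = []
    for _ in range(n_modes):
        cov = resid.T @ resid / resid.shape[0]    # space x space covariance
        var = np.diag(cov)
        # Variance at point j explained by a regression on point b is
        # cov[b, j]**2 / var[b]; sum over j. Flat points are never chosen.
        safe_var = np.where(var > tol, var, np.inf)
        explained = (cov ** 2).sum(axis=1) / safe_var
        base = int(np.argmax(explained))
        if explained[base] <= 0.0:                # nothing left to explain
            break
        modes.append((base, float(explained[base] / total_var)))
        # Subtract the part of every series linearly related to the base series.
        beta = cov[base] / var[base]
        resid = resid - np.outer(resid[:, base], beta)
    return modes


# Synthetic example: points 0-2 share one signal, points 3-4 another.
s1 = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
s2 = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
field = np.column_stack([4 * s1, 2 * s1, s1, 2 * s2, s2])
print(eot_base_points(field, n_modes=2))
```

In this synthetic case the first base point falls in the first group of points and the second in the other group, with explained fractions of 21/26 and 5/26 of the total variance.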

In Part I, EOT base points were identified in OBS, and only those modes which explain at least 5% of the total domain variability are used. This gives four modes (see Part I). They correspond to Northern China (EOT1), south of the Yangtze river (EOT2), Southern China (EOT3) and the north Yangtze river basin (EOT4). The distribution of these modes highlights the latitudinal organisation of heavy precipitation in this region, which will be discussed later. The modes correspond to different phases of the East Asian summer monsoon. EOT2 and EOT4 patterns look similar but they correspond to different phases of a strengthened monsoon front. EOT2 is related to a stronger front around 28\(^{\circ }\) N during the early summer (mid-June) while EOT4 is related to an enhanced front around 34\(^{\circ }\) N and happens later in the summer, from late June to July. Identical base points are used for this work and their regression patterns are shown in Fig. 1. This means that EOT patterns in the models are computed by regressing RX5d on each base point identified in OBS. With this forced EOT method, the main modes of variability are similar in OBS and models by construction. However, this does not necessarily mean that the associated patterns or mechanisms are similar. This is precisely what will be investigated in the following section.

For each EOT, a base region is defined as a 3 \(\times\) 3 grid box (6\(^{\circ }\) \(\times\) 6\(^{\circ }\)) centred on the base point. For intensity computations, RX5d is then averaged over this base region. This is consistent with the average scale of EOT patterns (Fig. 1) and makes sure that results are insensitive to a single grid point. Thus, when referring to the intensity of an EOT, it is the mean intensity of RX5d over the base region (3\(\times\)3 grid box around the base point).
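As an illustration of the base-region average, the sketch below takes the mean over a 3 × 3 box of grid points centred on a base point. It assumes a NumPy array with latitude and longitude as the last two axes and ocean points set to NaN so that only land points contribute; both the function name and the NaN masking are assumptions, not details taken from the study.

```python
import numpy as np


def base_region_mean(field, lat_idx, lon_idx, half_width=1):
    """Average `field` over a (2*half_width+1)^2 box centred on a base point.

    field: array of shape (..., lat, lon); indices are grid positions.
    Assumes the box lies fully inside the grid (no edge handling).
    """
    field = np.asarray(field, dtype=float)
    box = field[...,
                lat_idx - half_width:lat_idx + half_width + 1,
                lon_idx - half_width:lon_idx + half_width + 1]
    # nanmean skips masked (NaN) points, e.g. ocean, so only land contributes.
    return np.nanmean(box, axis=(-2, -1))


grid = np.arange(25.0).reshape(5, 5)
print(base_region_mean(grid, 2, 2))  # 12.0
```

With a leading time axis, the same call returns one base-region intensity per year, which is the series used for the intensity statistics.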

2.4 Test for significant model-observation differences

Due to the chaotic nature of precipitation, even a perfect model would not reproduce patterns identical to the observations. To evaluate the range of error expected when correlating precipitation patterns between models and observations, an auto-correlation test is used. With an ensemble size of 16, the UKESM1-0-LL model (Sellar et al. 2019) is used to explore this. EOT patterns from one member (REF) are considered as the reference [i.e. following the same “simulated observation” methodology as in Freychet et al. (2021)]. The other members are considered as model results, either as single members or by pooling several members together (up to seven members). The spatial correlations between REF and the different model result cases are computed. Then a different member is chosen as REF and the process is repeated. Results are combined to estimate the distribution of correlations when correlating “observations” with themselves, as a function of ensemble size. This gives the range of correlations within which a model signal can be considered consistent with the observed patterns.
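The logic of this test can be sketched as below. It is a minimal illustration under stated assumptions: each member's EOT pattern is already available as an array, pooled members are averaged before the correlation is computed, and every subset of the remaining members is enumerated (random sampling of subsets would serve equally well for large ensembles). The function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def pattern_correlation(a, b):
    """Pearson correlation between two flattened spatial patterns."""
    return float(np.corrcoef(np.ravel(a), np.ravel(b))[0, 1])


def pseudo_obs_correlations(patterns, n_pooled):
    """Correlate each member (as REF) with means of `n_pooled` other members.

    patterns: array of shape (member, ...) holding one EOT pattern per member.
    Returns all correlations, over every REF and every subset of the others.
    """
    patterns = np.asarray(patterns, dtype=float)
    members = range(patterns.shape[0])
    correlations = []
    for ref in members:
        others = [m for m in members if m != ref]
        for subset in combinations(others, n_pooled):
            pooled = patterns[list(subset)].mean(axis=0)
            correlations.append(pattern_correlation(patterns[ref], pooled))
    return correlations
```

Calling `pseudo_obs_correlations(patterns, k)` for k = 1 to 7 gives one distribution per ensemble size. A model whose correlation with OBS falls inside the distribution for its own number of members cannot be distinguished from internal variability by this measure.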

2.5 Regression analysis

A regression analysis is used to establish the link between large scale circulation and the EOT signals. This analysis is performed at both the seasonal and 5-day time scales. For the former, the summer mean of a dynamical variable is regressed on the interannual variability of each EOT intensity. This links the strength of RX5d to the seasonal circulation. For the latter, the 5-day mean of a dynamical variable is extracted each year at the time when RX5d occurs. This signal is then regressed on the interannual variability of EOT intensity. This allows identification of the circulation patterns related to each EOT.
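A regression of this kind can be sketched as follows, assuming the dynamical field is a NumPy array with one value per year along the first axis and the EOT intensity is a one-dimensional series of the same length. The function name is hypothetical, and the standardization of the index reflects the "normalized EOT" wording of the figure captions.

```python
import numpy as np


def regress_on_index(field, index):
    """Regression slope of `field` (time, ...) on a standardized index (time,).

    The result has the units of `field` per standard deviation of the index.
    """
    field = np.asarray(field, dtype=float)
    index = np.asarray(index, dtype=float)
    z = (index - index.mean()) / index.std()      # normalized EOT intensity
    anomalies = field - field.mean(axis=0)
    # Least-squares slope against a unit-variance predictor is the covariance.
    return np.tensordot(z, anomalies, axes=(0, 0)) / z.size
```

The same function applies to both time scales: the seasonal analysis passes JJA means as `field`, while the 5-day analysis passes the 5-day mean taken each year around the RX5d date.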

2.6 Change in risk of RX5d

The change in the most intense RX5d events is investigated by comparing different periods in the CMIP6 ensemble. The reference period is 1961–1980, as that is the earliest period with accessible and reliable observations. The historical change is defined as the shift between the reference period and 1991–2010. Three projection periods are also defined, using data from the Shared Socioeconomic Pathway 3 scenario (high challenges to mitigation and adaptation) leading to an increase of 7.0 W m\(^{-2}\) in radiative forcing by the end of the century (SSP370; Riahi et al. 2017). These periods correspond to the times when individual models reach + 1.5, + 2 or + 3 \(^\circ\)C global mean warming levels (at decadal scale) compared to the pre-industrial (1850–1900) period. To identify the warming targets, a method similar to Dosio et al. (2018) or Slater et al. (2021) is used. SSP370 was selected based on the number of individual models with available daily data at the time of the analysis and also because all models reach the three considered warming targets in this scenario. To avoid potential over-representation of a single model, only the first member of each individual model is selected. Results using all members of each model were also tested but were not found to be significantly different.
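The identification of a warming target can be illustrated with a simple running-mean rule. The sketch below is not the procedure of Dosio et al. (2018) or Slater et al. (2021); it only assumes that a target is reached in the first 10-year window whose mean global temperature exceeds the 1850–1900 mean by the chosen level. The function name and the synthetic temperature series are hypothetical.

```python
def first_year_reaching(years, gmst, level, baseline=(1850, 1900), window=10):
    """First start year of a `window`-year period whose mean warming,
    relative to the `baseline` mean, reaches `level` (deg C); None if never.

    The running-window rule is an assumption made for illustration.
    """
    base_values = [t for y, t in zip(years, gmst) if baseline[0] <= y <= baseline[1]]
    base_mean = sum(base_values) / len(base_values)
    for i in range(len(years) - window + 1):
        mean_anomaly = sum(gmst[i:i + window]) / window - base_mean
        if mean_anomaly >= level:
            return years[i]
    return None


# Synthetic series: flat until 2000, then warming by 0.05 deg C per year.
years = list(range(1850, 2101))
gmst = [14.0 + max(0, y - 2000) * 0.05 for y in years]
print([first_year_reaching(years, gmst, lvl) for lvl in (1.5, 2.0, 3.0)])
# [2026, 2036, 2056]
```

Because each model warms at its own rate, the years returned differ between models, which is why the projection periods are defined per model rather than by fixed calendar dates.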

For each period, RX5d events from 20 years of all model outputs are pooled together and a generalized extreme value (GEV) distribution is fitted to the pooled data. Extreme RX5d is defined as the 95th percentile of the reference period, which is exceeded with probability P0 = 0.05 in that period. The probability (P1) of reaching the same RX5d value is then computed for the other target periods from their GEV fits. The Risk Ratio (RR) is calculated as RR = P1/P0. Thus, an RR larger (smaller) than 1 indicates an increase (decrease) in the probability of extreme RX5d events compared to the reference period. Confidence intervals on the RR are computed with a bootstrapping method, recomputing the RR a thousand times from random samples and taking the 5–95% confidence interval.
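Once GEV parameters are available for two periods, the risk ratio follows directly from the GEV quantile and distribution functions. The sketch below covers only this step: the fitting and the bootstrap are assumed to be done elsewhere, and all parameter values are hypothetical. The shape parameter follows the convention in which a positive value gives a heavy upper tail; note that `scipy.stats.genextreme` uses the opposite sign for its shape argument.

```python
import math


def gev_quantile(p, loc, scale, shape):
    """Value with non-exceedance probability `p` for a GEV distribution.

    `shape` is xi in the convention where xi > 0 gives a heavy upper tail.
    """
    if abs(shape) < 1e-12:                       # Gumbel limit
        return loc - scale * math.log(-math.log(p))
    return loc + scale / shape * ((-math.log(p)) ** (-shape) - 1.0)


def gev_exceedance(x, loc, scale, shape):
    """Probability that a GEV-distributed value exceeds `x`."""
    if abs(shape) < 1e-12:
        return 1.0 - math.exp(-math.exp(-(x - loc) / scale))
    t = 1.0 + shape * (x - loc) / scale
    if t <= 0.0:                                 # x lies outside the support
        return 1.0 if shape > 0 else 0.0
    return 1.0 - math.exp(-t ** (-1.0 / shape))


def risk_ratio(ref_params, target_params, percentile=0.95):
    """RR = P1 / P0 for the reference-period `percentile` threshold."""
    threshold = gev_quantile(percentile, *ref_params)
    p0 = 1.0 - percentile
    p1 = gev_exceedance(threshold, *target_params)
    return p1 / p0


# Hypothetical (loc, scale, shape) values in mm/5-day, for illustration only.
reference = (100.0, 20.0, 0.1)
warmer = (110.0, 22.0, 0.1)
print(risk_ratio(reference, reference))  # close to 1 by construction
print(risk_ratio(reference, warmer))     # above 1: the threshold is exceeded more often
```

A bootstrap confidence interval would repeat the fit and this calculation on resampled data and take the 5th and 95th percentiles of the resulting RR values.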

2.7 Dynamics

To study the circulation mechanisms related to RX5d, several dynamical fields are used. The sea level pressure and the horizontal wind at 850 hPa show the low level circulation. The zonal wind at 200 hPa is used to show the East Asian jet position. The geopotential height at 500 hPa indicates the state of the mid-troposphere. To study the transport of humidity, the moisture flux convergence is computed using vertically integrated atmospheric moisture and wind, and its dynamical and thermodynamical contributions are separated following the methodology of Wu et al. (2017) and Tian et al. (2019). Finally, the impact of the ocean state is considered through the sea surface temperatures.

Fig. 1

Regression patterns (mm/5-day) of the seasonal 5-day maximum precipitation on the first four normalized EOTs identified in OBS during the 1961–2014 period. Regressions are computed against the normalized precipitation at the base point (shown by orange symbols) of each EOT

Fig. 2

Boxplot: Auto-correlation test with the UKESM model (grey boxplots), as explained in Sect. 2.4. Boxes show the 25th–75th quantile range, the notches being the confidence interval around the median, and bars indicate 1.5 \(\times\) the interquartile range outside of the box. The X-axis indicates the number of ensemble members used to compute the ensemble average. The boxplots show the distribution of the correlations between the mean from sub-ensembles of UKESM and a single UKESM simulation. Colour symbols: correlation between CMIP6 EOT patterns and observations. Blue and red symbols indicate higher and lower resolution individual models respectively. Orange symbols with black contouring show other models (i.e. medium resolution). The location of these symbols along the X-axis indicates the number of members averaged (across a single model) before computing the correlation. Note that symbols are slightly offset around their x positions to limit overlap. Models with more than 7 members are displayed on the right border of the plots

Fig. 3

As Fig. 1 but for the multi-model ensemble mean of the regressions. Regressions are computed against the normalized precipitation at the base point (orange circles) of each EOT defined from OBS. Results are shown for all models (top), high resolution models only (higher than 1.1\(^{\circ }\), middle) and low resolution models only (coarser than 1.8\(^{\circ }\), bottom)

Fig. 4

Top row: RX5d mean intensity of each EOT base region (mm/5-day) in OBS (white star), individual models (small coloured symbols) and ensemble mean of all (large white diamond), higher resolution (large blue diamond) and lower resolution (large red diamond) models. Numbers refer to individual models listed in Table 1. Grey shading indicates two standard errors of OBS (computed from the interannual variability) around its mean. Bottom row: Interannual variability (mm/5-day; defined as 1 standard deviation) of individual models (symbols) and OBS (grey shading)

Fig. 5

a Distribution of EOT timings (from June 1st to August 31st) for each individual model (light thin lines), multi-model ensemble (dark thick lines with circle symbols) and OBS (dark thick lines with diamond symbols). Ensemble results are obtained by pooling together results from each individual model and then computing the timing distribution. The mean, 25–75% and 10–90% intervals are shown by the symbols, solid lines and dashed lines respectively. b and c: As (a) but for higher (top) and lower (bottom) resolution models

Fig. 6

Regression patterns of total moisture flux convergence (MFC, mm) at the time when RX5d occurs, and of its thermodynamic (Th) and dynamical (Dyn) contributions, on normalized EOT signals. The base point of each EOT is shown by the orange circles. Dotted areas indicate significant results at the 95% confidence level

Fig. 7

As Fig. 6 but for the JJA seasonal mean sea surface temperature regression. Dotted area indicates significant results at 95% confidence level. Also note that the region is different from Fig. 6 to allow for a broader view of SST patterns

Fig. 8

As Fig. 7 but for seasonal atmospheric variables. Left: Sea level pressure (shading, Pa) and 850 hPa horizontal wind (vectors). Right: 500 hPa geopotential height (contours, m) and 200 hPa zonal wind (shading, m/s). For geopotential height, thick coloured contours indicate a signal above the 95% confidence interval. For other variables, only signals above the 95% confidence interval are shown

Fig. 9

Taylor diagram of individual model (circles) and ensemble mean (diamond symbols) performance in reproducing regression patterns associated with each EOT. Colors correspond to different dynamical fields (caption in the center of the figure) and summer total precipitation (TP). The black star, in each plot, shows the reference (ERA5)

Fig. 10

Generalized Extreme Value distribution fit to EOT base region intensity distributions (mm/5-day) during five periods: 1961–1980 and 1991–2010 historical periods (dark and light blue respectively), + 1.5, + 2 and + 3 \(^{\circ }\)C projection periods based on the SSP370 scenario (orange, red and dark red lines). The black vertical lines show the position of the 95th percentile from the 1961–1980 reference period

Fig. 11

Same as Fig. 10 but for the total JJA seasonal precipitation at EOT base region (mm) using a normal distribution

Table 1 CMIP6 individual models used for the study along with their number of members
Table 2 Risk Ratio (RR) associated with the change in extreme precipitation modes (EOTs) and seasonal mean precipitation

3 Ability of CMIP6 to reproduce observed RX5d

3.1 Characteristics of RX5d modes

The characteristics of the main modes of interannual variability of RX5d are analysed. Their magnitudes, spatial structures and seasonal-cycle timings are investigated. Higher and lower resolution models (1\(^{\circ }\) or less and 1.8\(^{\circ }\) or more, respectively) are separated into two sub-ensembles to highlight the potential role of horizontal resolution in the ability of models to represent EOTs. As a reminder from Part I, the observed spatial patterns of EOTs are shown in Fig. 1.

When looking at the performance of individual models (Fig. 2, coloured symbols) they are all within the range of internal variability (represented by the grey box plots in the figure, corresponding to the significance test detailed in Sect. 2.4). No individual model can be excluded based on the EOT spatial correlation results. Moreover, no systematic difference is found between higher and lower resolution models. Models with several members tend to have a better correlation with OBS, as expected from the auto-correlation test (Sect. 2.4). This suggests that using multi-member model means is beneficial to get patterns closer to the observed signal, at least for the events considered in this study. The model-OBS correlations are generally weaker for the EOT1 regression patterns. This mode corresponds to the northward migration of the monsoon front at the end of the summer, and based on these results, current climate models seem to have difficulties reproducing the latitudinal migration of the mode or its extent correctly. But the auto-correlation test also indicates a large internal variability of patterns, with correlations between members sometimes reaching close to 0. This means that even within the same model, North China precipitation signals vary greatly and it is difficult to define a robust pattern.

The spatial pattern of the normalized simulated EOT in the models (Fig. 3) is similar to the observed EOT, irrespective of model resolution. The main difference between higher and lower resolution models is the strength of the regression coefficient around the base point. Lower resolution models have slightly higher coefficients, suggesting that precipitation is correlated over larger scales. All models fail to capture part of the EOT1 signal over the Yangtze river basin. Indeed, in OBS this first mode of variability has heavy precipitation over North-East China and a second, weaker positive precipitation maximum over Central-East China, with an overall positive signal over the remaining areas of East China. In the models, only the maximum over North-East China is visible while no signal is visible over the Yangtze river (where the second maximum should be located). Thus, precipitation may be too systematically confined to the northern part of the region in the models. This could lead to an underestimation of extreme precipitation risk associated with this mode, especially over the Yangtze river basin.

In OBS, the intensity of RX5d differs between modes, ranging from about 80 mm for EOT1 to 170 mm for EOT2 (Fig. 4, top row). Although this contrast is broadly captured by most of the models, they tend to overestimate the mean intensity of EOT1 while underestimating the intensity of EOTs 2 and 4. Resolution has a clear effect on the magnitude of RX5d, with higher resolution models showing more intense precipitation. However, this does not mean that increasing resolution systematically improves performance. For EOT1, for example, higher resolution models have larger biases than lower resolution ones. Notably, the multi-model ensemble means and most of the individual model means are not within the range of OBS variability (grey shading). There is also less consistency between models for EOT3 (southern China), many of them being outside the range of OBS variability. Models also tend to overestimate the interannual variability of RX5d (Fig. 4, bottom row), especially for EOT1 and EOT3, which implies wider distributions and potential discrepancies between OBS and models for the most extreme events.

The timing of each EOT is important as it indicates its link to the migration of the monsoon front during summer. EOTs 2, 3 and 4 tend to occur from early to mid-summer (Fig. 5a), which is the mature phase of the summer monsoon. EOT1 occurs much later in the summer, during the latest phase of the monsoon when it migrates to its northward limit (Chen et al. 2004). The multi-model ensemble mean is close to the observed signal and reproduces the different timings well. The timing of each EOT nevertheless has considerable variability. Thus, even if each EOT corresponds preferentially to one phase of the summer monsoon, it can still happen outside of this phase. This large variability is present in OBS, individual models and multi-model ensemble results. Again, higher resolution is not found to improve the model results systematically (Fig. 5b, c).

In summary, these results suggest that the CMIP6 models are able to capture the main spatial and temporal characteristics of each EOT. The performance of ensemble means (either multi-member or multi-model) is usually better than that of individual realisations. Model resolution does not seem to impact the results presented here in a systematic way.

3.2 Dynamics of main RX5d modes

The main mechanisms associated with each EOT are investigated in this section. Part I indicated that EOT1 is related to a strong East Asian Summer Monsoon and a northward displacement of the upper-tropospheric westerly jet, that EOT2 and EOT4 are related to an enhanced and stable monsoon front and a strong western North Pacific subtropical high, and that EOT3 is associated with anomalous southerly wind bringing moist air from the South China Sea. The objective here is to verify that models represent extreme precipitation for the right reasons, i.e. with the same dynamical signal as in observations. Figures from observations are not shown here and readers are referred to Part I of the study for more details. The regression method is described in Sect. 2.5 of this paper.

The vertically integrated moisture flux convergence (MFC) patterns corresponding to the different EOTs are shown in Fig. 6. As expected, MFC is enhanced around the base point of each EOT. The MFC patterns associated with EOTs 2 and 4 indicate an overall increase in the monsoon front intensity, with a clear extension eastward. EOTs 1 and 3 are confined further inland. These features closely resemble the observed signal (Part I, their Fig. 5).

The contributions of thermodynamics and dynamics to MFC variability are also analysed following Part I. The former is related to a change in atmospheric moisture content while the latter corresponds to a change in moisture transport caused by changes in atmospheric circulation. Overall, dynamics is the dominant contributor for each EOT. This means that RX5d is mostly associated with a change in the monsoon dynamics, with stronger winds bringing more moisture over land. However, for the Yangtze river modes (EOTs 2 and 4) thermodynamics enhances the dynamical part. This contribution is associated with sea surface temperatures (Fig. 7), with both EOT2 and EOT4 presenting very similar seasonal regression patterns. They both show a warming over the South China Sea region (significant pattern in the figure), suggesting that enhanced evaporation is feeding the thermodynamic component of the MFC. The enhanced moisture available over the sea is then brought over land and increases precipitation. Again, all these features are also found in observations (Part I). The models seem to be able to reproduce the observed partitioning between dynamical and thermodynamical contributions to the MFC.

Results from Part I indicated that EOT interannual variability can be related to the interannual summer precipitation, and hence to the dynamics of the mean JJA precipitation. Following this finding, a regression analysis is conducted on key seasonal mean atmospheric variables (Fig. 8).

EOT1 is characterized by a low level anticyclonic circulation in the east of China, leading to enhanced moisture transport from the south toward North China (corresponding to the centre of this mode). This circulation anomaly may be related to the northward displacement of the East Asian upper-tropospheric jet stream, shown by a dipole anomaly of the 200 hPa wind (Fig. 8). This results in a mid-tropospheric positive geopotential anomaly and a lower sea level pressure over land, i.e. an enhanced monsoon circulation. The increased sea-land pressure gradient explains the increased southerly wind anomaly along China’s East coast [as shown by Lin (2013)]. These features are in agreement with the observed signal (Part I).

Yangtze river modes (EOTs 2 and 4) are associated with increased sea level pressure over the West Pacific region, leading to enhanced moisture transport onshore. This westward extension of the western North Pacific subtropical high is weaker for EOT2 than for EOT4, explaining the difference in latitudes of each EOT (EOT2 being confined more to the south of the Yangtze river while EOT4 extends to its north). This echoes previous studies showing that extreme precipitation events over the Yangtze River Basin were associated with air–sea heat fluxes and moisture transport (e.g. Gao and Xie 2014; Gao et al. 2016). Part I indicated that these modes were also related to the decaying phase of a winter El Niño and warmer Indian Ocean SST. Regression patterns in the models (Fig. 7) show similar positive anomalies over the Indian Ocean and South China Sea. Finally, both modes are also associated with an increased (reduced) East Asian jet stream north (south) of their base point, corresponding to enhanced convection.

The Southern mode (EOT3) is associated with a low-level dynamic anomaly over the South China Sea, with enhanced moisture transport from the sea converging over land. In the models, the SSTs have a La Niña-like pattern. Although a warmer West Pacific and Indian Ocean is also observed in the reanalysis (Part I), the tropical eastern Pacific signal is opposite to the observed one. This inconsistency may reflect the large variability in EOT3 intensity produced by models (Fig. 4) and highlights the difficulty that models have in correctly simulating precipitation in southern China (Chen and Frauenfeld 2014).

The previous results indicate an overall good performance of the CMIP6 multi-model ensemble mean in reproducing the dynamics associated with the four main modes of variability of extreme precipitation over Eastern China. Thus, this group of models is not only able to simulate each EOT pattern and intensity but also captures the right dynamical processes associated with the variability of each EOT. Individual model performance is more variable, as shown in Fig. 9 [see also Xu et al. (2021)]. The ensemble mean tends to be better than individual models, especially in terms of correlation, highlighting the importance of using multi-model ensembles for higher confidence in results. Still, ensemble mean correlations may seem lower than expected. This is because regression patterns are used here instead of direct dynamical fields or composites. Regression analysis usually leads to smoother spatial signals, with the multi-model mean smoothing these patterns even more. It is thus more difficult to obtain strong correlations with observations.

4 Changes in RX5d risks

Having validated the models, the change in risk is now investigated (see Sect. 2.6 for more details). Changes in the distribution of each EOT’s RX5d intensity are displayed in Fig. 10 and the Risk Ratios (RR) associated with the most extreme cases (95th percentile of the reference period, or a 1-in-20-year event) are provided in Table 2. For all cases, models indicate a weak change between the 1961–1980 and 1991–2010 periods, although all EOTs, except the second one, have an RR above 1. The shift becomes more apparent with higher warming levels. Almost all RRs are significantly greater than 1, ranging from about 1.5 to 2.5 (for the + 1.5 \(^{\circ }\)C and + 3 \(^{\circ }\)C targets respectively), meaning that the risk of reaching the 95th percentile values could be 50–150% larger than during the 1961–1980 period. Changes in mean seasonal precipitation distribution, computed at the base point of each EOT (Fig. 11), indicate overall weaker and more uncertain signals. In particular, during the historical period the seasonal RR is below 1 for each EOT location, indicating reduced seasonal-mean rainfall, in contrast to the generally increased risk for 5-day extremes. This may be related to a change in dynamical properties during the mid-1990s, as shown by Wu et al. (2020). In the projections, extreme precipitation modes emerge more clearly and strongly from the inter-model variability than does mean precipitation, in agreement with previous studies (e.g. Westra et al. 2014; Freychet et al. 2015, 2016; Burke and Stott 2017; Dong et al. 2020).

The above results strongly suggest an intensification of EOTs with future global warming, although the current signal is still weak. This means that China is likely to experience enhanced risks of 5-day extreme precipitation in the upcoming decades, with increased risks of all associated consequences (for example flooding and damage to infrastructure and crops; Zhang and Zhou 2020).

Note that these projections are scenario-dependent despite the fact that they correspond to specific warming targets. Indeed, the results here are tied to the SSP370 pathway and can be influenced by its projected changes in aerosol emissions. For example, Wang et al. (2021) showed that some scenarios adopted for the CMIP6 simulations underestimated the recent changes in aerosols over East Asia. A complete analysis of the impact of aerosol projections in the different SSP pathways would be needed to address this specific aspect.

5 Discussion and concluding remarks

Following the work of Tian et al. (2021), we conducted an evaluation analysis for the latest version of the Coupled Model Intercomparison Project, CMIP6. These results are complementary to those of Zhu et al. (2020), who investigated CMIP6 performance for different extreme precipitation indices.

CMIP6 multi-model ensemble mean results are consistent with the observations. The ensemble is able to reproduce the patterns of the main modes of interannual variability of extreme precipitation, along with a good representation of seasonal timing and magnitude. It can also reproduce most of the main features of the circulations associated with these modes. Individual model performances are also in overall agreement with observations, although there are some discrepancies, especially for the dynamical patterns. We also noticed that models tend to overestimate the variability of the North and South China modes, corresponding to a wider distribution and larger intensity compared to observations. Thus, results for these two regions may be slightly less reliable.

The resolution of the models is not systematically linked to their performance. We showed that using multi-member or multi-model ensembles is more beneficial than increasing the horizontal resolution. CMIP6 is found to be overall less reliable for South China region precipitation (EOT3), consistent with Tang et al. (2021), who highlighted similar results using a different methodology, although other studies also indicated a slight improvement in model performance from CMIP5 to CMIP6 (e.g. Dong and Dong 2021; Xu et al. 2021). We believe that even high resolution climate models are still too coarse to correctly represent convection and local processes impacting precipitation (the highest resolution models are around 100 km scale, which is far too coarse to resolve convection). A note of caution though: our conclusion does not mean that all models are equal or that the number of members is the only important factor. Horizontal resolution is only one aspect considered in our study. Models could also be grouped by their physical schemes (or other core schemes). This could highlight some systematic differences between model families, but it is beyond the scope of this study. For example, Chen et al. (2021) have shown a weakened dependency between horizontal resolution and model performance between CMIP5 and CMIP6 (for precipitation over East Asia and the western North Pacific), indicating that improved physical frameworks and parameterizations may play a major role.

After validating the model performance, we conducted a risk change analysis for both the historical period and future global warming targets. Although the risk change is small and not always clear during the historical period, in contrast to the risk change in seasonal mean precipitation, the future projections indicate a clear increase in the risk of reaching the most extreme values for each EOT, i.e. overall intensified extreme precipitation, consistent with findings from Xu et al. (2021). These results highlight the vulnerability of Eastern China to disasters associated with heavy precipitation.

In this study we found that the CMIP6 ensemble is reliable for analysing changes in extreme precipitation over Eastern China. However, we stress that although the ensemble mean shows good performance, individual model results vary. Using large ensembles remains a solid method to improve confidence and enables reliable results to be extracted from noisy climate change signals, even in the case of extreme precipitation. Also, this study focused on seasonal precipitation. Thus, the question of whether increasing resolution improves the fidelity of shorter time-scale heavy precipitation in the models remains open.