1 Introduction

Changes in precipitation can affect society more directly than changes in most other meteorological variables. Of particular concern are changes in extreme precipitation. Intense rainfall events can lead to flash floods, often resulting in infrastructure damage, considerable impacts on natural ecosystems, and even human casualties. Over the Asian domain, the largest socioeconomic losses are linked to floods attributed to these extreme rainfall events (Roxy et al. 2017; Vellore et al. 2014, 2016; Goswami et al. 2006; Krishnan et al. 2015; Pai and Sridhar 2015; Rajeevan et al. 2008). Understanding and quantifying their present magnitude and how they may change in the future is of immense importance to society.

Precipitation is difficult to represent in climate models owing to its high variability across nearly all temporal and spatial scales. Convective parameterization increasingly dominates events of extreme precipitation (e.g., Berg et al. 2013). In climate models, simulated precipitation occurs more frequently but is less intense than observed heavy rainfall (Trenberth et al. 2011). One factor that greatly influences the accuracy of modeled precipitation is the size of model grid cells. In particular, horizontal resolution has a stronger impact on precipitation extremes than on mean precipitation, especially in the tropics (Li et al. 2011). Extreme rainfall can arise from two different scales of rain-bearing systems: synoptic-scale (100–1000 km) and meso-scale (1–100 km); therefore, climate models with horizontal resolution finer than 100 km could greatly improve quantification of the extreme events due to meso-scale phenomena. Despite this, research on the dependence of extreme daily rainfall on horizontal resolution is more limited than that focused on mean precipitation. This may reflect a lack of reference datasets (i.e., observational datasets) of daily precipitation with fine grid resolution over long periods, as compared with monthly mean data, and/or a shortage of long-term high-resolution climate simulations. Previous investigations of extreme precipitation over Asia have been restricted to diverse observational and reanalysis datasets (Rana et al. 2015; Ceglar et al. 2017; Huang et al. 2016).

To date, general circulation models (GCMs) with resolution finer than 100 km have been explored by only a few research institutes owing to the enormous computer resources required. In general, computational costs limit GCMs to grid spacings larger than 100 km, resulting in a powerful representation of global-scale circulation responses and continental rainfall, but a less effective representation of the regional and local scales required by most decision makers (Rummukainen 2010). For this reason, Regional Climate Models (RCMs) have been used as an alternative. While high-resolution RCMs are computationally less expensive and can resolve finer-scale orographic precipitation, they require the specification of lateral boundary conditions, which inhibit self-consistent interactions between global and regional scales of motion (Fox-Rabinovitz et al. 2006). Recently, many evaluation studies using RCMs have been conducted through the Coordinated Regional Climate Downscaling Experiment (CORDEX) program (Kim et al. 2014, 2015; Huang et al. 2015; Zhou et al. 2016; Zou et al. 2016; Pattnayak et al. 2017). In particular, the added value of RCMs has attracted considerable interest and effort (Di Luca et al. 2015; Torma et al. 2015).

A hierarchy of phenomena with various spatial and temporal scales makes up climate, and better representation of small-scale features (e.g., Baiu rain band and tropical cyclones) in high-resolution models has improved seasonal mean climate simulations (Kitoh et al. 2008). Simulations using the Community Atmosphere Model (CAM3) show robust systematic improvements with higher horizontal resolution for a variety of features, most notably for those associated with large-scale dynamical circulation (Hack et al. 2006). This resolution dependence is largely due to the specification of better-resolved surface boundary conditions (e.g., land-sea mask, soil and vegetation parameters) (Schiemann et al. 2014). Similarly, Li et al. (2011) reported that extreme precipitation has a stronger sensitivity to horizontal resolution in terms of the global zonal mean when using the CAM3 simulation. High-resolution climate models that resolve topographical features and capture fine-scale climate processes (e.g., surface moisture and snow albedo feedbacks) (Diffenbaugh et al. 2005) can more accurately simulate observed precipitation extremes (Walker et al. 2009). For example, owing to its high horizontal resolution, the 20-km mesh atmospheric general circulation model (AGCM) captures orographic rainfall that is accurate in terms of both location and amount. Through its influence on resolved dynamics, horizontal grid resolution strongly affects the distribution of the Intertropical Convergence Zone (ITCZ) and the amount of tropical precipitation (Abiodun et al. 2008). Sensitivity experiments with the Community Earth System Model (CESM) that altered horizontal grid resolution demonstrated reduced biases at the highest resolution (0.23° × 0.31°) over Europe, the USA, and Australia (Kopparla et al. 2013). Furthermore, the fraction of large-scale precipitation was seen to be larger at high resolution (Bacmeister et al. 2013; Kopparla et al. 2013). Volosciuk et al. (2015) found that the effects of averaging and representation of physical processes in the ECHAM5 model at different horizontal resolutions vary with region and season.

In contrast, other studies have found fewer benefits from increasing horizontal resolution in climate models (Cardoso et al. 2013), arguing that without improvement in the representation of physics, increased resolution alone may provide only limited improvement (Buizza 2010). Shallow and convective parameterizations can cause unrealistic development of storms that produce intense precipitation in high-resolution GCM simulations (Williamson 2013). For the Model for Prediction Across Scales-Atmosphere (MPAS-A), increasing grid resolution to 30 km led to a problem with double peaks and a double ITCZ in both mean and extreme precipitation (Landu et al. 2014; Yang et al. 2014).

Recently, more projects have started to use GCM simulations with a resolution of < 100 km, for example CMIP5 and CMIP6; UK PRACE, weather-resolving Simulations of Climate for globAL Environmental risk (UPSCALE); High resolution Global Environmental Modeling (HiGEM); and the Kyosei project and KAKUSHIN program. However, to date, studies have demonstrated the effects of high resolution on rainfall only with a single climate model rather than with multiple models (Chen et al. 2008; Kopparla et al. 2013; Schiemann et al. 2014; Volosciuk et al. 2015; Wehner et al. 2010; Yang et al. 2014). Furthermore, these studies have mainly considered extreme rainfall on a global or continental (e.g., the USA) scale, and few have focused on the Asian monsoon region. Evaluating Asian monsoon rainfall by sub-monsoon region remains important because the sub-monsoon systems in Asia are independent of each other but, at the same time, interact with each other (Kripalani and Kulkarni 2001).

In this study, our main goal is to compare precipitation extremes over the Asian summer monsoon region from multiple reference datasets and from multiple high-resolution CMIP5 climate simulations, and to ascertain whether the inferences based on different datasets agree or differ. Multiple datasets are employed so that the results do not depend on any particular dataset. Furthermore, we examine the effects of model resolution on the simulated characteristics of the extreme precipitation climatology over the Asian monsoon region. The next section describes the datasets and the methodology employed.

2 Data and methods

2.1 The seven reference datasets

In this study, we used gridded daily precipitation datasets with high resolution. These datasets fall roughly into three categories: rain-gauge-based (APHRODITE, CPC-UNI), satellite-based (TRMM, GPCP1DD), and reanalysis (ERA-Interim, MERRA, and JRA55). To ensure a common period among datasets, we selected data for the period from 1998 to 2007. A brief summary of the datasets is given in Table 1. A detailed description of each dataset follows.

Table 1 Brief summary of datasets used in this study
  1.

    The APHRODITE (Asian Precipitation Highly Resolved Observational Data Integration Towards Evaluation) project has developed a state-of-the-art daily precipitation dataset on a high-resolution (0.5° lat/lon) grid for the Asian region. APHRODITE’s Water Resources project has been executed by the Research Institute for Humanity and Nature (RIHN) and the Meteorological Research Institute of Japan Meteorological Agency (MRI/JMA) since 2006 (Yatagai et al. 2012). This dataset is generated primarily from ground-based data obtained from an in-situ rain-gauge observation network of between 5000 and 12,000 stations.

  2.

    Climate Prediction Center unified (CPC-UNI) is a global gauge-based daily precipitation product from the National Oceanic and Atmospheric Administration (NOAA). Gauge reports from over 30,000 stations are collected from multiple sources, including the Global Telecommunications System (GTS), the Cooperative Observer (COOP) network, and other national and international agencies. Quality control is performed through comparisons with historical records and independent information from measurements at nearby stations, radar and satellite observations, as well as model forecasts. CPC-UNI uses an optimal interpolation (OI) technique to represent area-averaged values of precipitation over grid boxes (Chen and Knutson 2008). Finally, quality-controlled station reports are interpolated to create analyzed fields of daily precipitation with consideration of orographic effects (Xie et al. 2007). Here, we used the CPC-UNI, version 1.0 (v1.0), global land data on a 0.5° lat/lon grid.

  3.

    The seventh research version of the Tropical Rainfall Measuring Mission (TRMM 3B42 V7) relies primarily on passive microwave (PMW) precipitation estimates from the Special Sensor Microwave Imager (SSM/I), the Special Sensor Microwave Imager and Sounder (SSMIS), the TRMM Microwave Imager (TMI), the Advanced Microwave Sounding Unit (AMSU), the Microwave Humidity Sounder (MHS), and the Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E). PMW data were first calibrated using the combined TMI and TRMM precipitation radar product (PR) and were then used to calibrate geosynchronous IR inputs (Huffman et al. 2007). After the preprocessing, the 3-hourly multi-satellite fields are summed for the month and combined with the monthly accumulated Global Precipitation Climatology Centre (GPCC) rain gauge analysis using inverse-error-variance weighting to form a monthly best-estimate precipitation rate, which is TRMM Product 3B43. The TRMM dataset covers 50°S to 50°N from 1998 to the present and is available at 3-hourly temporal resolution on a 0.25° spatial grid.

  4.

    The Global Precipitation Climatology Project (GPCP) sponsored by the World Climate Research Program and Global Energy and Water Cycle Experiment provides global precipitation products based on satellite and rain gauge information on a daily scale (Huffman et al. 2001). The GPCP 1-Degree Daily (GPCP1DD) version 1.2 dataset is produced by optimally merging estimates computed from microwave, infrared, and sounder data observed by the international constellation of precipitation-related satellites and precipitation gauge analyses.

  5.

    The European Centre for Medium-Range Weather Forecasts (ECMWF) Interim reanalysis (ERA-Interim) is the most recent global atmospheric reanalysis produced by ECMWF covering the period from 1979 until the present (Dee et al. 2011). ERA-Interim uses four-dimensional variational data assimilation (4D-Var) with a 12 hourly cycle, a revised humidity analysis, variational bias correction for satellite data, and other improvements in data handling. These data are available at a daily resolution on a 0.5° lat/lon grid.

  6.

    Modern Era Retrospective-analysis for Research and Applications (MERRA) uses a three-dimensional variational (3D-Var) analysis algorithm based on the Grid-point Statistical Interpolation (GSI) scheme with a 6-h cycle. The GSI was originally developed at NCEP and is now jointly developed by NCEP and the Global Modeling and Assimilation Office (GMAO). MERRA is produced by the GEOS-5 (Goddard Earth Observing System) atmospheric general circulation model based on finite-volume dynamics. It includes a number of advancements over the 3D-Var algorithms based on the GSI (Rienecker et al. 2011). GMAO at the National Aeronautics and Space Administration Goddard Space Flight Center produces this satellite-era analysis (NASA MERRA), which is available from 1979 to the present. MERRA data are on a native grid of 2/3° longitude by 1/2° latitude.

  7.

    JMA performed its second reanalysis project (known as the Japanese 55-year Reanalysis, JRA55) using the TL319 version of JMA’s operational data assimilation system with a 4D-Var scheme as of December 2009, and a newly prepared dataset of past observations. The dataset covers the 55 years from 1958, when regular radiosonde observations began on a global basis. Many of the deficiencies of JRA25 are alleviated in JRA55 because the data assimilation (DA) system used for the project featured a variety of improvements introduced after JRA25. As a result, the JRA55 project produced a high-quality homogeneous climate dataset covering the last half century. JRA55 has a reduced Gaussian grid at TL319 resolution (0.563° longitude and 0.563° latitude grid).

2.2 CMIP5 model data

We used the Atmospheric Model Intercomparison Project (AMIP) simulations from 29 CMIP5 AGCMs for the 28-year period from January 1980 to December 2007 (Table 2). The models are grouped according to their spatial resolution: high-resolution (grid spacing of < 1°), medium-resolution (grid spacing between 1° and 2°), and low-resolution (grid spacing of > 2°). Each model uses different physics schemes. All of the datasets described above are produced by different numerical models or assimilation systems and can thus be considered independent of each other.
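The grouping rule above can be written as a small function. This is only an illustrative sketch: the function name is hypothetical, the treatment of spacings of exactly 1° or 2° as medium-resolution is an assumption (the text does not say how boundary values are handled), and the example spacings are not taken from Table 2.

```python
def resolution_class(grid_spacing_deg: float) -> str:
    """Assign a model to a resolution group from its grid spacing in degrees.

    Hypothetical helper; boundary values (exactly 1 or 2 degrees) are
    assumed to fall in the medium group.
    """
    if grid_spacing_deg <= 0:
        raise ValueError("grid spacing must be positive")
    if grid_spacing_deg < 1.0:
        return "high"
    if grid_spacing_deg <= 2.0:
        return "medium"
    return "low"


# Example with made-up spacings (not values from Table 2).
groups = {spacing: resolution_class(spacing) for spacing in (0.56, 1.25, 2.8)}
```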

Table 2 Summary of CMIP5 models

2.3 Analysis region

Our region of interest is the Asian summer monsoon region (15°S–45°N, 70–150°E: Fig. 1a). This region consists of South Asia (the Indian sub-continent), East Asia (China, Korea and Japan) and Southeast Asia (Myanmar and Thailand through to the Indonesian islands and the maritime continent). Considering the spatial features associated with the Asian summer monsoon and the number of stations, for some analyses we separated the entire domain into three sub-regions: the East Asian region (20–40°N, 110–150°E), the South Asian region (5–25°N, 70–90°E), and the Southeast Asian region (10–20°N, 100–110°E). The maritime continent is also one of the Asian summer monsoon sub-domains, but it is excluded when averaging over the Southeast Asian region.
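As a minimal sketch of how a sub-region average might be computed on a regular latitude/longitude grid, the snippet below uses the three boxes given above. The function name, the cosine-latitude area weighting, and the handling of missing values are assumptions made for illustration; the text does not state how the averaging was weighted.

```python
import numpy as np

# Sub-region boxes as (lat_min, lat_max, lon_min, lon_max), taken from the text.
REGIONS = {
    "East Asia": (20.0, 40.0, 110.0, 150.0),
    "South Asia": (5.0, 25.0, 70.0, 90.0),
    "Southeast Asia": (10.0, 20.0, 100.0, 110.0),
}


def region_mean(field, lat, lon, box):
    """Area-weighted mean of a 2-D (lat, lon) field inside a box.

    Weights are cos(latitude), an assumption for a regular lat/lon grid.
    NaN cells (e.g., ocean points in a land-only dataset) are ignored.
    """
    lat_min, lat_max, lon_min, lon_max = box
    lat_sel = (lat >= lat_min) & (lat <= lat_max)
    lon_sel = (lon >= lon_min) & (lon <= lon_max)
    sub = field[np.ix_(lat_sel, lon_sel)]
    weights = np.cos(np.deg2rad(lat[lat_sel]))[:, None] * np.ones(lon_sel.sum())
    valid = ~np.isnan(sub)
    return float(np.sum(np.where(valid, sub, 0.0) * weights) / np.sum(weights * valid))
```

The Southeast Asian box as defined (10–20°N, 100–110°E) already lies north of the maritime continent, so no separate mask is applied in this sketch.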

Fig. 1 a Asian Monsoon domain covering South Asia (India), East Asia and Southeast Asia. b 850 hPa wind vectors for JJA based on the ERA-Interim reanalysis dataset

2.4 Methodology

According to Yang et al. (2014), upscaling fine-resolution data onto a coarse grid through averaging reduces signals and variances. Consequently, the mismatch between process and analysis scales often makes it difficult to produce reliable statistics for the aggregated data. In this study, we therefore interpolate the datasets from coarse to fine resolution to compare the characteristics among the multiple datasets.
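The loss of variance under averaging can be illustrated with synthetic data. The sketch below is not the procedure of Yang et al. (2014): it draws spatially independent gamma-distributed "daily rainfall" (an assumption; real rainfall is spatially correlated, so the reduction would be smaller), block-averages it onto a grid four times coarser, and compares the variance and the 95th percentile before and after. All names and parameter values are hypothetical.

```python
import numpy as np


def block_average(field, factor):
    """Average non-overlapping factor x factor blocks over the last two axes."""
    nt, ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    blocks = field.reshape(nt, ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(2, 4))


# Synthetic daily fields: 920 days (10 JJA seasons) on a 16 x 16 fine grid.
rng = np.random.default_rng(0)
fine = rng.gamma(shape=0.5, scale=10.0, size=(920, 16, 16))
coarse = block_average(fine, 4)

# Averaging preserves the mean but shrinks the variance and the upper tail.
p95_fine = np.percentile(fine, 95)
p95_coarse = np.percentile(coarse, 95)
```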

Rainfall is a point process with large spatial and temporal discontinuities ranging from very weak to strong events within small temporal and spatial scales (Malik et al. 2012; Wulf et al. 2012). To define daily precipitation extremes, we used the percentile approach (Diffenbaugh et al. 2005; Malik et al. 2012; Kopparla et al. 2013; Singh et al. 2013). In particular, for a percentile threshold of p, we find the pth percentile of the distribution of daily rainfall over all the grids over the Asian domain. We then examine the spatio-temporal characteristics of rainfall exceeding this percentile, the absolute value of which will differ between datasets. The advantage of this approach is that it removes the effects of bias in precipitation amounts between the different datasets, while retaining reliable information about precipitation pattern and behavior (Kendon et al. 2012). Here we calculate the 75th, 95th and 99th percentiles of the daily precipitation probability distribution function (PDF) at each grid point for June–August (JJA) of each year. Heavy rainfall days are defined as days with precipitation of at least 30 mm/day (Bhowmik et al. 2007; Kusumoki et al. 2012; Agnihotri et al. 2015), and their number gives the frequency of heavy rainfall. In addition, we analyze the right tail of the precipitation distribution, defined as the difference between the 99th and 90th percentiles (Scoccimarro et al. 2013, 2014), analogous to the interquartile range (IQR), which is calculated as the 75th minus the 25th percentile. Furthermore, we evaluate the extreme precipitation fraction as the ratio of the 95th percentile precipitation to the precipitation climatology.
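The quantities defined in this paragraph can be sketched for a single dataset as follows. This is a minimal illustration, not the authors' code: the function and key names are hypothetical, all days (wet and dry) are assumed to enter the percentile calculation, and the "precipitation climatology" in the denominator of the fraction is assumed to be the mean over the same days that are passed in.

```python
import numpy as np


def extreme_metrics(daily, heavy_threshold=30.0):
    """Per-grid-point extreme statistics from daily precipitation.

    daily: array of shape (n_days, n_lat, n_lon) in mm/day, e.g. the JJA
    days of one year or of the whole period (the caller decides).
    """
    p25, p75, p90, p95, p99 = np.nanpercentile(daily, [25, 75, 90, 95, 99], axis=0)
    mean = np.nanmean(daily, axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        fraction = np.where(mean > 0, p95 / mean, np.nan)
    return {
        "p75": p75,
        "p95": p95,
        "p99": p99,
        "right_tail": p99 - p90,  # 99th minus 90th percentile
        "iqr": p75 - p25,         # 75th minus 25th percentile
        "heavy_days": np.sum(daily >= heavy_threshold, axis=0),
        "p95_fraction": fraction,  # 95th percentile / mean
    }
```

Because the percentiles are computed separately for each dataset, a dataset that is uniformly wetter or drier than another yields a different threshold in mm/day but a comparable spatial pattern, which is the bias-removal property the paragraph describes.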

Next, we estimate the uncertainty in the seven reference datasets and in the CMIP5 models using the signal-to-noise ratio (SNR), which is calculated as:

$$SNR=\frac{X}{\sigma } \tag{1}$$

where \(X\) is the ensemble mean of the datasets and \(\sigma\) is their variability (i.e., the standard deviation among the datasets). A high SNR indicates that the datasets agree closely relative to the magnitude of their mean, whereas a low SNR indicates large uncertainty. This uncertainty could come from various sources, such as internal variability, model differences, and scenario differences (Hawkins and Sutton 2011; Kim et al. 2016). Before we present the results of the extreme precipitation analysis, a brief climatology of the boreal summer monsoon period is described next.
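Equation (1) can be evaluated at each grid point once all datasets are on a common grid. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the population standard deviation (`ddof=0`) is assumed because the text does not specify which estimator was used.

```python
import numpy as np


def signal_to_noise(fields):
    """Grid-point SNR = ensemble mean / standard deviation among datasets.

    fields: array of shape (n_datasets, n_lat, n_lon) on a common grid.
    Points where the datasets are identical (zero spread) are set to NaN.
    """
    fields = np.asarray(fields, dtype=float)
    mean = fields.mean(axis=0)
    spread = fields.std(axis=0, ddof=0)  # assumption: population estimator
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(spread > 0, mean / spread, np.nan)
```

For example, three datasets reporting 1, 2 and 3 mm/day at a point give a mean of 2 and a spread of about 0.82, hence an SNR of about 2.45.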

3 Observation analysis

3.1 Climatological features over the Asian domain

Changes in precipitation extremes are primarily influenced by changes in large-scale circulation over the tropics (Maredith et al. 2015). Most of the reanalysis datasets reasonably reproduce the climatological circulation features. Hence, Fig. 1b displays the vector winds at the lower tropospheric level (850 hPa) derived from the ERA-Interim dataset. This figure clearly illustrates the summer monsoon circulation pattern. Over the South Asian region, the strong southwesterly flow over the Arabian Sea (0–20°N, 40–70°E) transports large amounts of moisture from the Indian Ocean towards the Indian sub-continent. This flow gathers further moisture from the Bay of Bengal and transports it towards northeastern India and the Myanmar–Thailand region. The most dominant feature over East Asia is the North Pacific Subtropical High. The southeasterly/southerly/southwesterly flow along the western edge of this high (Fig. 1b; 10°S–30°N, 110–130°E) transports moisture from the West Pacific Ocean towards China, Korea and Japan. Moisture from the South China Sea is also transported towards South China and the neighboring regions of Southeast Asia, in particular the Philippines and Vietnam. The maritime continent over the Indonesian region receives moisture from the South Pacific Ocean brought by the southeasterly winds. Thus the main moisture sources for the precipitation can be inferred from Fig. 1b. Next, we present the seasonal summer monsoon rainfall patterns as depicted by the seven datasets (Fig. 2).

Fig. 2 Maps of summer monsoon precipitation climatology for the period 1998–2007 for the seven datasets: a ensemble mean based on the seven datasets, b GPCP1DD, c TRMM, d APHRODITE, e CPC-UNI, f ERA-Interim reanalysis, g MERRA reanalysis, and h JRA55 reanalysis

Over South Asia, large amounts of rainfall are located over the west coast of India (orographic effects) and over central and northeast India. Over Southeast Asia, the Arakan and Myanmar coasts (orographic effects) receive large amounts of rainfall. Over East Asia, regions from southeast China to Korea and Japan display large amounts of rainfall. However, CPC-UNI does not capture rain over the Myanmar and Arakan coasts. Interestingly, ERA-Interim correctly reproduces the very low precipitation over the southeast Indian peninsula (Fig. 2f). This region receives the major proportion of its annual rainfall during the October–December period (Kripalani and Kumar 2004). This same dataset also reproduces the climatology and Global Monsoon precipitation with the highest skill (Lin et al. 2014). MERRA and JRA55 tend to overestimate rainfall over Southeast Asia in boreal summer. More details on the South and East Asian monsoons are available in a recent paper (Preethi et al. 2017), and on Southeast Asia in earlier papers (Kripalani and Kulkarni 1997, 1998). Perfect reference datasets do not exist, and there are large uncertainties in extreme rainfall among various datasets (e.g., Sunyer et al. 2013; Turco et al. 2013). Many previous studies have suggested using rain-gauge datasets as the reference, but these too have shortcomings, as the stations do not cover the required domain. It is therefore necessary to investigate the differences in extreme precipitation, if any, across the seven datasets before assessing the climate models.

3.2 Extremes based on percentiles

Figure 3 shows extreme rainfall for boreal summer in the individual datasets and their ensemble mean in terms of the 95th percentile values of the daily precipitation PDF. Although the ensemble of our seven datasets is affected by smoothing, it still captures precipitation in mountainous regions well compared with each individual dataset. In the individual datasets, the distributions of extreme values are similar to the daily mean precipitation in terms of spatial characteristics, but show larger differences in magnitude. In particular, GPCP and TRMM (i.e., the satellite-based datasets) display higher extreme precipitation intensity over the Asian monsoon region than the remaining five datasets. CPC-UNI exhibits the lowest precipitation over Myanmar and the neighboring regions. This is similar to the findings of Rana et al. (2015), who showed that CPC-UNI poorly represents seasonal mean precipitation over the same region. ERA-Interim also tends to show lower values over India and the maritime continent. As noted earlier, ERA-Interim is the only dataset that displays subdued monsoon activity (Fig. 3f) over the southeast Indian peninsula. As pointed out in the preceding section, most of the annual rain over this region occurs during the October–December period (Kripalani and Kumar 2004). Among the reanalysis datasets, JRA55 is the most similar to the ensemble mean, but shows relatively higher rainfall values over Southeast Asia. MERRA reproduces less rainfall over Korea and Japan.

Fig. 3 Same as Fig. 2, but for the 95th percentile of precipitation

The main inferences drawn from Fig. 3 can be summarized as follows. The highest-resolution dataset (TRMM, 0.25°) clearly depicts the heavy orographic rainfall zones over South Asia, in particular the west coast of India, the hilly region of northeast India and adjoining Bangladesh, and the Arakan/Myanmar coast. This may be due to the low-pressure systems that form over the Bay of Bengal and transport moisture over these regions. This high-resolution dataset also brings out zones of heavy rain events over South China and the adjoining Thailand region of Southeast Asia. These heavy rain events may be due to tropical storms over the South China Sea striking South China and the adjoining coasts. A recent study (Zhan et al. 2017) reported that western China, including the Tibetan Plateau, has experienced a significant change in extreme events over the past decades. The GPCP data at 1° resolution also display these regions of heavy rain events, to a certain extent. The remaining five datasets depict heavy rain events in the 20–30 mm/day range. In general, regions of heavy rain events are better displayed by high-resolution datasets. Thus the spatial patterns for the same variable (here the 95th percentile precipitation) show some differences among the various datasets considered here. However, the spatial distributions of the 95th percentile values resemble the patterns obtained for the mean seasonal rainfall (compare Figs. 2, 3). Such inferences have been documented earlier (Boers et al. 2016).

Another distinctive feature of extreme events is the shape of the right tail of the precipitation PDF for JJA in the seven datasets. This is illustrated by the differences between the 99th and the 90th percentile values (Fig. 4). GPCP and JRA55 are similar to the ensemble of all the datasets. TRMM shows a particularly large right-tail range between the 90th and 99th percentiles, while APHRODITE, ERA and MERRA show a smaller right-tail range than the others (Fig. 4).

Fig. 4 Same as Fig. 3 but for right tail distributions of the precipitation calculated for the difference between the 99th and 90th percentiles

Differences exceeding 35 mm/day are clearly depicted by the TRMM dataset over South Asia (western parts of India through central India up to northeast India), over East Asia (major parts of China, Korea and Japan), and over Southeast Asia (Vietnam and northern parts of the Philippines). Similar patterns are displayed by the GPCP, CPC, and JRA55 datasets but with differences in the 25–35 mm/day range. MERRA shows less variation over the entire domain. In fact, ERA and MERRA hardly show any differences over the maritime continent. The differences between the 99th and 90th percentiles displayed by ERA and MERRA are considerably smaller than those of the other reference datasets, which means that their range of extreme values is narrow.

3.3 Frequency of heavy precipitation days

In terms of the frequency of heavy rainfall (Fig. 5), GPCP and TRMM display frequent heavy rainfall events (≥ 30 mm/day) over South, East and Southeast Asia. The frequency of heavy rain displayed by ERA-Interim is near zero over the Indian subcontinent. Although both CPC-UNI and APHRODITE are rain-gauge-based datasets, they appear to differ considerably over the maritime continent and India (Fig. 5). Furthermore, the frequency of heavy rain displayed by APHRODITE, ERA and MERRA over Southeast Asia appears to be considerably lower than in the other four datasets. In addition, the frequency of heavy rain events appears to be better captured by the TRMM dataset over the west coast of India and northeast India than by GPCP, CPC-UNI, and ERA-Interim. Regions surrounding the South China Sea (i.e., South China, Vietnam, Borneo, the Philippines and the Indonesian islands) are also well captured. Even regions receiving moisture from the West Pacific (i.e., southeast China, Korea and Japan) show heavy rain events. While the APHRODITE dataset captures the heavy rain events over South Asia, the ERA dataset does not display any heavy rain events there. CPC captures these events over Southeast Asia. Hence, there are differences among the datasets in displaying these heavy rain events, in particular over Southeast Asia and the maritime continent.

Fig. 5 Number of heavy rainy days (≥ 30 mm/day) for the period 1998–2007, a GPCP, b TRMM, c APHRODITE, d CPC-UNI, e ERA-Interim reanalysis, f MERRA reanalysis, and g JRA55 reanalysis

3.4 Ratio of extremes to climatology

Because each dataset differs somewhat in capturing the spatial distribution of mean and extreme rainfall, we presume that the ratio of extreme precipitation to climatology may exhibit greater similarity among the datasets and thus alleviate the possible uncertainties in rainfall characteristics. As shown in Fig. 6, over most of the region the ratios are greater than six. However, over the maritime continent the ratios are below four. In fact, the ratios are below two in the ERA and MERRA datasets. Maximum ratios exceeding 12 are displayed over central and western parts of India by most of the datasets. Satellite and rain-gauge based sources tend to show increased agreement with the ensemble data; however, ERA and MERRA show systematic biases compared with the distribution of the ensemble. Even though CPC-UNI is a rain-gauge dataset, its scaled value is relatively high between latitudes 30°N and 40°N. The patterns of the ratio between the 95th percentile and the climatological mean rainfall intensity are similar between APHRODITE and JRA55 and also between ERA and MERRA. In addition, GPCP and TRMM resemble each other, with higher ratio values. Extremely high ratios north of 30°N are displayed by all the datasets, in particular by the CPC data, owing to the considerably lower seasonal rain intensity over these regions.

Fig. 6 Maps of the 95th percentile of precipitation as fraction of the climatology during the period 1998–2007 for the datasets: a ensemble mean based on the seven datasets, b GPCP, c TRMM, d APHRODITE, e CPC-UNI, f ERA-Interim reanalysis, g MERRA reanalysis, and h JRA55 reanalysis

Tables 3 and 4 show quantitative assessments of high percentile precipitation across the seven datasets for the sub-domains defined in Sect. 2.3. The consistency of high percentile rainfall over the East Asia region is greater than that for India and Southeast Asia (Indochina). Over East Asia, GPCP, TRMM, and APHRODITE are in much better agreement with the ensemble according to pattern correlation, standard deviation, RMSE, and skill score. For CPC-UNI, ERA-Interim, and MERRA, the spatial distribution is less similar to that of the ensemble. In particular, CPC-UNI has the largest inconsistency in high percentile precipitation and skill score (95th percentile: 0.215, 0.128), (99th percentile: 0.350, 0.161) over East Asia and Southeast Asia. Despite CPC-UNI (0.5° × 0.5°) having double the horizontal resolution of GPCP (1.0° × 1.0°), GPCP is more similar to APHRODITE. This could reflect the discrepancy in the number of stations, with APHRODITE using a more extensive gauge network than CPC-UNI (Gebregiorgis and Hossain 2015; Rana et al. 2015).

Table 3 Statistical analysis of reference dataset on 95th and 99th percentile precipitation over East Asia, India and Southeast Asia
Table 4 Statistical analysis of reference dataset on scaled value of 95th and 99th percentile precipitation over East Asia, India and Southeast Asia

For the scaled values of higher percentile precipitation, the consistency among all datasets improves over East Asia, India, and Southeast Asia (Indochina) according to pattern correlation, standard deviation, RMSE, and skill score. Notably, CPC-UNI has a low skill score for the original values, whereas its skill score for the scaled values is higher (95th percentile: 0.849, 0.415) (99th percentile: 0.843, 0.443). Moreover, over Southeast Asia, all datasets have lower coherency in extreme values than over East Asia and India, especially CPC-UNI and JRA55. However, when scaled, the performance of each dataset improves. Although each dataset may contain numerous systematic biases in the amount of higher percentile precipitation, our analyses show that the datasets capture the characteristics better in terms of the ratio with respect to climatology. As an exception, scaled values over Southeast Asia for JRA55 bring worse outcomes in pattern correlations (95th percentile: 0.726, 99th percentile: 0.755) than original values (95th percentile: 0.191, 99th percentile: 0.532).
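Two of the statistics used in Tables 3 and 4, pattern correlation and RMSE, can be sketched as below for one dataset against the ensemble. This is an assumption-laden illustration: the function name is hypothetical, the correlation is the centered, unweighted form, and the standard deviation is returned as a ratio to the reference in the style of a Taylor diagram. The skill score is not implemented because its definition is not given in this section.

```python
import numpy as np


def pattern_stats(field, reference):
    """Centered pattern correlation, RMSE and std ratio of two 2-D maps.

    Both maps must share a grid; cells that are NaN in either are skipped.
    No area weighting is applied (an assumption of this sketch).
    """
    a = np.asarray(field, dtype=float).ravel()
    b = np.asarray(reference, dtype=float).ravel()
    ok = ~(np.isnan(a) | np.isnan(b))
    a, b = a[ok], b[ok]
    da, db = a - a.mean(), b - b.mean()
    corr = float(np.sum(da * db) / np.sqrt(np.sum(da ** 2) * np.sum(db ** 2)))
    rmse = float(np.sqrt(np.mean((a - b) ** 2)))
    std_ratio = float(a.std() / b.std())
    return corr, rmse, std_ratio
```

A map that is a linear rescaling of the reference has a pattern correlation of 1 but a non-zero RMSE, which is why a dataset with a strong amplitude bias can still score well on pattern correlation.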

3.5 Signal-to-noise ratio (SNR)

Figure 7 shows the uncertainty ranges of the climatology, the daily mean for boreal summer, the 95th percentile precipitation, the right tail of precipitation (99th minus 90th percentile), and the scaled precipitation across the datasets through the SNR distributions. The SNR spatial pattern for the climatology appears similar to that for the mean precipitation in JJA (Fig. 7a, b), while the SNR for the extreme value (95th percentile) appears similar to the 99th–90th percentile pattern (Fig. 7c, d). The SNR for the difference between the 90th and 99th percentile precipitation appears smaller than that for the 95th percentile precipitation (compare Fig. 7c, d). In other words, the width of the right tail of the precipitation PDF may have larger uncertainties. The SNR values for the scaled 95th percentile (i.e., as a fraction of climatology) appear relatively higher than those for the 95th percentile precipitation over South and East Asia (compare Fig. 7c, e). Consequently, using a scaled value appears to largely reduce the systematic biases in heavy precipitation regions with large uncertainties. In summary, the SNR spatial patterns show higher ratios over East Asia than over South and Southeast Asia. This may imply that precipitation data over East Asia are more consistent among the datasets.

Fig. 7

Signal to noise ratio (SNR) for a climatology, b mean precipitation, c 95th percentile precipitation, d 99th–90th percentile precipitation and e 95th percentile of precipitation as fraction of climatology

3.6 Interannual variability

To examine uncertainty in the interannual variability of extreme precipitation across the datasets, the standard deviation of the 95th percentile precipitation for JJA has been calculated (Fig. 8). The datasets differ considerably in the interannual variability of the extreme value, especially over heavy rain regions. TRMM shows relatively strong interannual variability, while ERA-Interim and MERRA show relatively weak variability. GPCP, TRMM, APHRODITE, CPC-UNI, and JRA55 show larger interannual variability over complex topography, whereas ERA-Interim shows higher variability over northeast India and its neighborhood. In summary, TRMM clearly shows the largest interannual variability over South Asia as well as East Asia. The spatial pattern for GPCP is similar but with reduced variability compared with TRMM. Similar patterns are seen for CPC-UNI and JRA55, except over the Maritime Continent and Myanmar.
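
A minimal sketch of this calculation is given below. It assumes that the 95th percentile is taken over all JJA days of each year (dry days included) and that the interannual variability is the sample standard deviation of those yearly values; the treatment of dry days and the exact estimator are assumptions, and the function name `jja_p95_interannual_std` is hypothetical.

```python
import numpy as np


def jja_p95_interannual_std(daily, years, months):
    """Interannual standard deviation of the JJA 95th percentile.

    daily:  array (n_days, n_lat, n_lon) of daily precipitation
    years:  integer array (n_days,) giving the year of each day
    months: integer array (n_days,) giving the month (1-12) of each day
    ASSUMPTION: the percentile uses all JJA days, including dry days.
    """
    daily = np.asarray(daily, dtype=float)
    years = np.asarray(years)
    months = np.asarray(months)
    is_jja = np.isin(months, [6, 7, 8])

    p95_by_year = []
    for year in np.unique(years[is_jja]):
        in_season = is_jja & (years == year)
        # 95th percentile of the daily values within this year's JJA season
        p95_by_year.append(np.percentile(daily[in_season], 95, axis=0))

    # Sample standard deviation across years at each grid cell
    return np.stack(p95_by_year).std(axis=0, ddof=1)
```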

Fig. 8

Interannual variability of the 95th percentile precipitation during summer for a GPCP, b TRMM, c APHRODITE, d CPC-UNI, e ERA-Interim reanalysis, f MERRA reanalysis, and g JRA55 reanalysis

3.7 Decadal trends

Kim and Park (2016) suggested that, among the reference datasets, the linear trend in precipitation may have larger uncertainty than the mean and the interannual variability. Figure 9 shows the decadal trends of extreme rainfall (i.e., 95th percentile precipitation for JJA) in the datasets over the period 1998–2007, calculated as linear regression coefficients. Except for APHRODITE, ERA-Interim, and JRA55, the datasets show a remarkable increasing trend over central India, which is consistent with the findings of Rana et al. (2015) and Prakash et al. (2014). Singh et al. (2013) found that APHRODITE has opposite trend signs for extreme precipitation intensity from the 2000s onwards when compared with the IMD dataset (2140 rain-gauge stations).
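
The trend calculation can be sketched as an ordinary least-squares slope of the yearly 95th percentile values against the year at each grid cell. Expressing the slope per decade (multiplying by 10) is an assumption about the plotted units, and the function name `linear_trend_per_decade` is hypothetical.

```python
import numpy as np


def linear_trend_per_decade(years, series):
    """Least-squares linear trend at each grid cell, per decade.

    years:  array (n_years,)
    series: array (n_years, n_lat, n_lon), e.g. yearly JJA 95th percentiles
    ASSUMPTION: the trend is reported per decade (slope per year times 10).
    """
    years = np.asarray(years, dtype=float)
    series = np.asarray(series, dtype=float)
    t = years - years.mean()
    anomalies = series - series.mean(axis=0)
    # Regression coefficient: sum(t * y') / sum(t^2), evaluated per grid cell
    slope_per_year = np.tensordot(t, anomalies, axes=(0, 0)) / np.sum(t ** 2)
    return 10.0 * slope_per_year
```

With only ten yearly values (1998–2007), such slopes are sensitive to individual years, which is one reason the trends can differ strongly among datasets.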

Fig. 9

Decadal trends of the 95th percentile precipitation during summer for the period 1998–2007 in the datasets: a GPCP, b TRMM, c APHRODITE, d CPC-UNI, e ERA-Interim reanalysis, f MERRA reanalysis, and g JRA55 reanalysis

Decadal trends in the 95th percentile values reveal several notable features (Fig. 9). Increasing trends are observed over the central Indian region from west to east; on the other hand, decreasing trends are observed just to the north, over the foothills of the Himalaya. This is consistent with the results of Preethi et al. (2017). Four of the datasets (GPCP, TRMM, APHRODITE and MERRA) capture these features over the Indian subcontinent. CPC-UNI captures only the increasing trend over the Indian region, and ERA-Interim does not capture either of these trends. A threefold rise in extreme precipitation events over central India has been recently reported (Roxy et al. 2017; Goswami et al. 2006). Over East Asia, a wave-like pattern with increasing, decreasing, and increasing trends from north to south China is observed. Except for ERA-Interim, the remaining five datasets display this feature, which is also consistent with earlier studies (e.g., Preethi et al. 2017). Over Southeast Asia, none of these datasets exhibits a trend. Distinctively, JRA55 has negative trends over the Asian summer monsoon domain, unlike the other datasets. Next, we examine the outputs from the CMIP5 simulations.

4 CMIP5 analysis

Kharin et al. (2007) showed that IPCC AR4 models simulate extreme precipitation well in the extra-tropics, but that uncertainties in extreme precipitation for the tropics are very large. Scoccimarro et al. (2013) found that, in terms of the zonal average, 99th percentile precipitation is consistently simulated over mid and high latitudes in the northern summer season but underestimated in the tropics by CMIP5 models with a spatial resolution of ~ 100 km.

To analyze the spatial distribution of the 95th percentile of boreal summer (JJA) precipitation in the 29 AMIP-type simulations, APHRODITE was used as the reference dataset because it offers the longest record, high spatial resolution, and station-based information. Results for the high-resolution models are presented in Fig. 10, for the medium-resolution models in Fig. 11, and for the low-resolution models in Fig. 12.

Fig. 10

Horizontal distributions of the 95th percentile of precipitation for summer during the period 1980–2007 in APHRODITE (reference dataset) and individual CMIP5 high-resolution models

Fig. 11

Same as Fig. 10 but for the medium-resolution models

Fig. 12

Same as Fig. 10 but for the low-resolution models

The fine-resolution models (< 1°) capture the spatial variations in extreme rainfall and the rainfall amounts over East Asia, India, and Southeast Asia reasonably well (Fig. 10). However, both the GFDL models and the CMCC model exhibit the maximum intensity of the 95th percentile over the Indian subcontinent. For the medium-resolution group (Fig. 11), most of the models display maximum intensity over northeast India and South China. Finally, two models of the low-resolution group (BCC and FGOALS; Fig. 12h, k) display maximum intensity over Bangladesh, Myanmar and Thailand. In general, all three groups show common features of the 95th percentile; however, the regions of maximum intensity appear to be model dependent.

4.1 Precipitation biases

The composite precipitation biases are illustrated in Fig. 13: the upper panels show the 95th percentile, the central panels the 99th–90th percentile, and the lower panels the frequency of heavy rain events. For the 95th percentile, the negative biases over northern and central India amplify from the high-resolution models through to the low-resolution models (compare Fig. 13b–d). For the same statistic, similar inferences can be drawn over China. For the 99th–90th percentile, the positive biases depicted by the high-resolution models change to negative biases in the low-resolution models (compare Fig. 13f–h). Similar inferences can be drawn for the heavy rain events (Fig. 13j–l).

Fig. 13

Biases of 95th percentile precipitation (a–d); 99th–90th percentile precipitation (e–h); and the number of heavy rain days (≥ 30 mm/day) (i–l) in CMIP5 model composites of all the models, high-resolution, medium-resolution, and low-resolution models
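
The three statistics shown in Fig. 13 can be sketched for a single model as follows. The code assumes that the model and the reference are on a common grid, that percentiles are taken over all days of the analysis period, and that heavy rain days are counted over the whole period rather than per year; these choices, together with the names `extreme_stats` and `bias_against_reference`, are illustrative assumptions. Only the 30 mm/day threshold is taken from the figure caption.

```python
import numpy as np

HEAVY_RAIN_THRESHOLD = 30.0  # mm/day, as stated in the Fig. 13 caption


def extreme_stats(daily):
    """Extreme precipitation statistics at each grid cell.

    daily: array (n_days, n_lat, n_lon) of daily precipitation in mm/day.
    ASSUMPTION: percentiles use all days; heavy days are a total count.
    """
    daily = np.asarray(daily, dtype=float)
    p90, p95, p99 = np.percentile(daily, [90, 95, 99], axis=0)
    return {
        "p95": p95,
        "tail_width": p99 - p90,  # the 99th-90th percentile difference
        "heavy_days": (daily >= HEAVY_RAIN_THRESHOLD).sum(axis=0),
    }


def bias_against_reference(model_daily, reference_daily):
    """Model minus reference for each statistic (negative = underestimate)."""
    model = extreme_stats(model_daily)
    reference = extreme_stats(reference_daily)
    return {name: model[name] - reference[name] for name in model}
```

A group composite would then be the average of these bias fields over the models in each resolution group.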

A known source of error in the simulation of precipitation in current models is the convective parameterization scheme. An increase in model resolution also influences the simulation of mean and extreme precipitation, owing to the improved representation of complex topography and land surface processes (Kendon et al. 2012). A comparison of the results for the three resolution groups in Fig. 13 supports the results of Kendon et al. (2012).

4.2 Comparison of low and high percentile

Figure 14 illustrates the performance of individual models using Taylor diagrams of the 75th, 95th, and 99th percentile values against the reference. Over East Asia, most models in the high-resolution group perform better than the low-resolution group in terms of 75th, 95th, and 99th percentile precipitation, especially for the 99th percentile. These results are consistent with those of Jiang et al. (2015), who show that models with higher resolution produce relatively smaller errors in extreme precipitation. The low-resolution group reproduces low percentile (i.e., 75th percentile) precipitation reasonably well, as shown by the correlation, standard deviation, and RMSE. The spatial correlation in all models over Southeast Asia (Indochina) is relatively high (0.5–0.9) despite larger ratios of variance and RMSE. Over India, the models show a relatively large spread and lower spatial correlation, especially in the variability of 99th percentile precipitation for models in the low- and high-resolution groups. In contrast, the medium-resolution group shows relatively better performance for the 99th percentile precipitation. Over Southeast Asia (Indochina), there exist outliers in 75th, 95th, and 99th percentile precipitation belonging to the low-resolution group. Among all of the models considered, the best skill scores are for CMCC-CM, MRI-AGCM3.2H, and CAM5 (all in the high-resolution group), and the worst are for FGOALS-g2 and IPSL-CM5B-LR (both in the low-resolution group).

Fig. 14

Taylor diagrams of 75th (a–c), 95th (d–f), and 99th (g–i) percentile precipitation between APHRODITE and individual CMIP5 models over East Asia, South Asia (India), and Southeast Asia (Indochina) for High-resolution (red), Medium-resolution (blue), Low-resolution (green)

To confirm whether similar characteristics exist for different GCMs, the distributions of the mean and extreme values are evaluated statistically through the pattern correlation, the ratio of spatial standard deviations, RMSE, and the Taylor skill score, calculated on the daily mean and the 95th percentile precipitation. Extreme precipitation simulated by CMIP5 models shows higher pattern correlation than does the mean over East Asia, India, and Southeast Asia, but shows lower skill according to the ratio of spatial standard deviations, RMSE, and the Taylor skill score. Over India and Southeast Asia, the skill score of extreme values for the low-resolution group is noticeably lower, indicating that extreme values depend more strongly on model resolution than mean values do, which is consistent with the findings of Volosciuk et al. (2015). When the precipitation is scaled, the skill over India and Southeast Asia for the low-resolution group improves remarkably compared with the original value. This could imply that low-resolution models fail to reproduce extreme rainfall owing to limitations of resolution or parameterization, but capture the ratio of heavy rainfall with respect to climatology moderately well.
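
The four statistics named above can be sketched for one model field and one reference field as follows. The exact form of the skill score is not given in this section, so the code assumes the commonly used score of Taylor (2001), S = 4(1 + R) / [(σ + 1/σ)² (1 + R₀)], with the maximum attainable correlation R₀ set to 1. It also assumes unweighted grid cells (no area weighting) and reports the centered RMSE used in Taylor diagrams. The function name `taylor_statistics` is hypothetical.

```python
import numpy as np


def taylor_statistics(model, reference, r0=1.0):
    """Pattern statistics between a model field and a reference field.

    ASSUMPTIONS: unweighted grid cells; centered RMSE; skill score in the
    form of Taylor (2001) with maximum attainable correlation r0.
    """
    m = np.asarray(model, dtype=float).ravel()
    r = np.asarray(reference, dtype=float).ravel()
    valid = np.isfinite(m) & np.isfinite(r)  # drop missing grid cells
    m, r = m[valid], r[valid]

    m_anom = m - m.mean()
    r_anom = r - r.mean()
    correlation = np.sum(m_anom * r_anom) / np.sqrt(
        np.sum(m_anom ** 2) * np.sum(r_anom ** 2)
    )
    std_ratio = m.std() / r.std()
    centered_rmse = np.sqrt(np.mean((m_anom - r_anom) ** 2))
    skill_score = 4.0 * (1.0 + correlation) / (
        (std_ratio + 1.0 / std_ratio) ** 2 * (1.0 + r0)
    )
    return {
        "correlation": correlation,
        "std_ratio": std_ratio,
        "centered_rmse": centered_rmse,
        "skill_score": skill_score,
    }
```

Under this form, a model with a perfect spatial pattern but twice the reference variability still scores only 0.64, which illustrates how a field can have a high pattern correlation and yet a low skill score.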

4.3 SNR distribution

The SNR distribution of the mean and extreme values across the models (Fig. 15) appears broadly similar to that of the datasets examined earlier. For daily mean and extreme rainfall over monsoon rainfall regions (i.e., Central India, East Asia, and Indochina), the SNR for the high-resolution group is relatively higher than for the other groups, implying smaller inter-model spread. The relatively lower SNR for the medium- and low-resolution groups suggests that such models include large uncertainties. We find larger uncertainties in the right tail of the precipitation distribution than in the daily mean and the 95th percentile precipitation over Central India, Southeastern China, Indochina, and the Maritime Continent. For scaled extreme precipitation, the uncertainties for all three groups are noticeably reduced over East Asia, India, and Southeast Asia as compared with the original 95th percentile precipitation. Most notably, the high-resolution group shows the greatest consistency across models.

Fig. 15

SNR for mean (a–c); 95th percentile of precipitation (d–f); 99th–90th percentile of precipitation (g–i), and scaled 95th percentile precipitation (j–l) for June–August (JJA) in CMIP5 model composites: high-resolution, medium-resolution, and low-resolution models

5 Summary and conclusions

We analyzed daily extreme precipitation related to the Asian summer monsoon in fine-resolution observed datasets and CMIP5 simulations. Seven reference datasets from different sources were examined to compare the precipitation extremes displayed by each of them. The datasets considered were APHRODITE, CPC-UNI, TRMM, GPCP1DD, ERA-Interim, MERRA, and JRA55. Together, the datasets draw on the information available through rain gauges, satellites and reanalysis. These datasets are based on different assimilation systems and have different horizontal resolutions, and thus could be considered independent of each other. Spatial patterns of several statistics related to extreme precipitation were compared among these datasets. Results showed that there are robust differences among the datasets in extreme precipitation statistics. The largest differences are observed over Southeast Asia, compared with South and East Asia. From the SNR analysis, mean and extreme rainfall over regions with large topographic variations include more uncertainties; however, they appear relatively more consistent over East Asia. Hence, whether an ensemble mean is an appropriate choice remains debatable.

Using the output of 29 CMIP5 models, we investigated whether GCMs can realistically represent extreme precipitation and whether uncertainty in GCMs is affected by horizontal resolution. We divided the models into three categories: high, medium and low resolution. In the high-resolution group, the bias distributions of 95th percentile precipitation and the frequency of extreme precipitation showed generally smaller magnitudes than for the other two model groups. In particular, India (latitude 20–30°N) and Southeastern China were found to be the most dependent on horizontal resolution. As shown in the Taylor diagram analysis, the performance of the high-resolution group was relatively higher over East Asia, but lower over India. Several models in the low-resolution group produced outliers in high percentile precipitation. Among the sub-domains, East Asia had the best performance. The SNR analysis showed better agreement across the models in the high-resolution group than in the medium- and low-resolution groups over Central India and East Asia, in terms of the mean, the 95th percentile precipitation, and the right tail of precipitation. As with the seven datasets, the scaled 95th percentile precipitation for JJA was more consistent among the model results than the original 95th percentile value. In summary, there are robust differences among the model outputs with respect to the simulation of extreme precipitation, which is expected for model results. Hence, defining a multi-model ensemble mean could also be debatable.

Our study has focused on the impact of horizontal resolution on extreme precipitation over the Asian monsoon region, but the contribution of model resolution remains debatable. In particular, previous studies have demonstrated that the physical parameterization schemes of climate models also influence simulated precipitation (Im et al. 2008; Endo et al. 2012; Ali et al. 2015). Several studies have argued that increasing horizontal resolution does not dramatically improve the large-scale climatology (Bacmeister et al. 2013; Wehner et al. 2014). Nevertheless, our results show that high-resolution simulations reproduce daily extreme rainfall over the Asian summer monsoon region more reasonably.