1 Introduction

Global climate change and rapid urbanization have made extreme events more frequent in recent years (Wu 2017). From 18:00 on July 18 to 00:00 on July 21, 2021 (Beijing time, the same below), under the combined influence of terrain and water vapor transported by Typhoon “Fireworks” (In-Fa), heavy rain fell across large parts of Henan Province, and torrential rain hit Zhengzhou, Xuchang, and Xinxiang cities (Central Meteorological Observatory 2021a, b). The daily rainfall at 10 meteorological stations in Henan Province broke the historical extremes since records began. In Zhengzhou City, the maximum 24-h accumulated rainfall reached 624.1 mm, exceeding the annual accumulated rainfall of 2019 (509.5 mm) (Central Meteorological Observatory 2021a, b). Shaoguan is a typical city in South China, where mountains and hills dominate the terrain and the average annual rainfall exceeds 1500 mm. During the flood season, it is often struck by continuous rainstorms, and flooding, flash floods, and debris flows occur frequently. These disasters cause huge losses of life and heavy damage to the social economy every year. From 08:00 on June 8 to 08:00 on June 8, 2020, heavy rain fell across central and southern Shaoguan, and some localities were hit by torrential rain. In Qujiang County, the daily rainfall reached 460.7 mm, setting a new record for daily rainfall in Shaoguan City (China Meteorological Administration 2020). The extreme precipitation caused road collapses, flooded houses, and collapsed walls in parts of the city, and traffic was disrupted.
Therefore, performing a long-duration (24-h) precipitation frequency analysis in Shaoguan and compiling a 24-h rainfall frequency atlas is of great practical significance, because it can give city managers a scientific basis for tiered prevention of and rescue from extreme precipitation disasters.

There are two core issues in hydrological frequency analysis: the precise estimation of parameters and the accuracy of quantiles. There are also two major difficulties: the optimal frequency distribution function of the rainfall data cannot be derived through theoretical analysis, and the true value of the quantiles is unknown (Lin 2010a, b; Lin et al. 2012). At present, two approaches are frequently used for the frequency analysis of precipitation: at-site and regional. Since at-site frequency analysis needs long periods of record, the reliability of its estimates is usually hampered by the insufficient length of recorded data, especially when estimating rainfall for large return periods (Bernard et al. 1995). In contrast, regional frequency analysis can improve the accuracy of quantile estimates by pooling data from several sites that have similar characteristics within a homogeneous region (Hosking and Wallis 1998). The index-flood method proposed by Hosking and Wallis (1993) is widely used with the L-moments approach to undertake regional frequency analysis. L-moments are reported to have theoretical advantages over conventional moments: they are more robust to outliers in the data and less subject to bias in the estimation of parameters (Hosking and Wallis 1998; Lin et al. 2006; Lin 2010b; Liang et al. 2013). Later, by applying the concept of regional analysis, Lin et al. (2006) and Lin (2010b) separated rainfall into a common component and a local component in the precipitation frequency analysis for the NOAA Atlas 14 in the USA (Lin et al. 2006; Lin 2010b; Bonnin et al. 2012) as well as for the Taihu Lake Basin in China (Lin 2010b; Wu et al. 2015).
Other studies have also used the regional L-moments approach, which combines regional analysis with L-moments, for rainfall frequency analysis, both abroad (e.g., UK (Fowler and Kilsby 2010), Italy (Norbiato et al. 2007), Norway (Hailegeorgis et al. 2013)) and in many regions of China (e.g., Guangxi (Chen et al. 2014), Yangtze River Delta region (Yin et al. 2016), Huaihe River Basin (Shao et al. 2016), Jiangxi (Ding et al. 2018; Liu et al. 2018), Sichuan (Li et al. 2019), and Xiamen (Shang et al. 2019)). These results indicate that the regional L-moments method can handle the identification of homogeneous regions, the testing and selection of the best-fitting distribution, and the estimation of parameters and quantiles at places of interest.

In general, L-moments methods are used with observational rainfall data from stations, but the gauges in Shaoguan are distributed unevenly and sparsely. Most satellite-based precipitation products appeared after 2000, and because they are a relatively new type of precipitation data, few studies have combined satellite precipitation data with L-moments methods in frequency analysis. In this study, Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (IMERG) gridded rainfall data are used as an example to explore and evaluate the applicability of satellite precipitation data in the univariate L-moments method, which has been used in many studies and is thus relatively mature (Jing et al. 2014; Modarres 2010; Sarmadi and Shokoohi 2014). The GPM products provide precipitation data with higher accuracy, larger coverage, and higher spatial and temporal resolution, and they improve the detection accuracy of precipitation; the results are therefore also expected to compensate for the shortage of rainfall stations in the study area.

The contents of this paper are as follows: Section 2 introduces the study area and data sets used. Following that, the theoretical method of regional L-moments analysis (RLMA) and mean bias correction (MBC) are described in Section 3. Later, the results and discussion are given in Section 4, and the main conclusions are summarized in Section 5.

2 Study area and data

2.1 Study area

The target area, Shaoguan City, lies in the north of Guangdong Province between \(112^\circ 50\mathrm{^{\prime}}\) and \(114^\circ {45}^{\mathrm{^{\prime}}}\mathrm{ E}\) longitude and \(23^\circ 5\mathrm{^{\prime}}\) and \(25^\circ 3\mathrm{^{\prime}}\mathrm{ N}\) latitude and covers a total area of about 18,400 \({\mathrm{km}}^{2}\). According to the statistics, the average annual rainfall of Shaoguan City over the past 30 years (from 1991 to 2020) is 1695.1 mm, and the average air temperature is 20.3 ℃. As the elevation map shows (Fig. 1), two-thirds of the city is occupied by mountains, which are mostly located in the west, while plains are mainly distributed in the central part. Three main mountain ranges, the Yaoling, Huashi, and Qingyun Mountain Ranges, run approximately in parallel from northwest to southeast, with a maximum elevation of 1902 m.

Fig. 1 Elevation map of Shaoguan City, and locations of the rain gauge stations

2.2 Data and sources

2.2.1 Observational precipitation data

Observational precipitation data from 8 meteorological stations and 62 hydrological stations, provided by the Guangdong Hydrology Bureau and the Shaoguan Meteorological Bureau, were used in this study. The distribution of these stations is shown in Fig. 1; 39 of them lie in a “buffer zone” within 30 km of Shaoguan City, which was included to improve the reliability of frequency estimates at the edge of the study area. The data comprise 24-h annual maximum precipitation from the hydrological stations and hourly precipitation from the meteorological stations. A minimum effective record length of 20 years, within the period from 1980 to 2020, was required to ensure the reliability of the calculation results. After review, the record at every station is longer than 20 years, so all 70 stations are usable.

2.2.2 IMERG rainfall data

One of the major problems in precipitation frequency analysis in the Shaoguan area is that the stations with long-term, reliable rainfall records are distributed unevenly and sparsely. In situ-based gridded rainfall data can be used in regions where long-term reliable precipitation observations are not available because of their spatial and temporal continuity (Ahmed et al. 2019). In this paper, the IMERG satellite gridded rainfall products provided by the National Aeronautics and Space Administration (NASA) were used for research and analysis. The data cover June 1, 2000, to December 31, 2020, over the area between 112° 36′ and 115° E longitude and 23° 42′ and 25° 48′ N latitude. The temporal resolution of the data is 1 day, and the spatial resolution is \(0.1^\circ \times 0.1^\circ\). Existing studies have shown that this data set performs well in China, with high spatial and temporal resolution and global coverage (Chen et al. 2017; Wu et al. 2019). Experimental results also showed that in Shaoguan and its surrounding areas, the correlation coefficient between observed precipitation and the corresponding IMERG rainfall is 0.7 to 0.8 on the daily and annual scales and exceeds 0.9 on the seasonal scale. Therefore, the IMERG data are considered reasonable and applicable to this study.

3 Methodology

3.1 Regional L-moments analysis

The RLMA includes the following steps which mainly use an index-flood procedure: identification of homogeneous regions, parameter estimation, choice of an appropriate probability distribution, and estimation of quantiles. The theory of RLMA and the methods used in these steps are briefed in this section.

3.1.1 L-moments approach

L-moments are an alternative system to summarize the statistical properties of the hydrologic data and to describe the shape of the probability distributions. L-moments arose as modifications of the probability-weighted moments (PWMs) of Greenwood et al. (1979) and are defined by Hosking (1990) as expectations of certain linear combinations of order statistics. Letting \({X}_{1:n}\le {X}_{2:n}\le \dots \le {X}_{n:n}\) be the order statistics of a random sample of size \(n\) drawn from the distribution of \(X\), the L-moments of a probability distribution are defined by Hosking et al. (1998):

$$\begin{array}{c}{\lambda }_{1}=E({X}_{1:1})\\ {\lambda }_{2}=\frac{1}{2}E({X}_{2:2}-{X}_{1:2})\\ {\lambda }_{3}=\frac{1}{3}E({X}_{3:3}-2{X}_{2:3}+{X}_{1:3})\\ {\lambda }_{4}=\frac{1}{4}E({X}_{4:4}-3{X}_{3:4}+3{X}_{2:4}-{X}_{1:4})\end{array}$$
(1)

In general, the \(r\) th L-moment of variable \(X\) is

$${\lambda }_{r}={r}^{-1}\sum\nolimits_{k=0}^{r-1}{\left(-1\right)}^{k}\left(\genfrac{}{}{0pt}{}{r-1}{k}\right)E\left({X}_{r-k:r}\right),r=1, 2,\dots$$
(2)

where \(E({X}_{r-k:r})\) is the expected value of the \((r-k)\) th order statistic in a sample of size \(r\).

Hosking and Wallis (1998) defined the L-moment ratios as follows:

$$\begin{array}{c}\mathrm{Coefficient}\;\mathrm{of}\;L-\mathrm{variation}\;(L-C_v):\tau_2=\lambda_2/\lambda_1\\\mathrm{Coefficient}\;\mathrm{of}\;L-\mathrm{skewness}\;(L-C_s):\tau_3=\lambda_3/\lambda_2\\\mathrm{Coefficient}\;\mathrm{of}\;L-\mathrm{kurtosis}\;(L-C_k):\tau_4=\lambda_4/\lambda_2\end{array}$$
(3)
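In practice, the population quantities in Eqs. (1) to (3) are replaced by sample estimates. The following is a minimal sketch, assuming the standard unbiased probability-weighted-moment estimators \(b_0,\dots,b_3\) are used (the paper does not state which estimator was applied); the function name `sample_lmoments` is illustrative only.

```python
def sample_lmoments(data):
    """Return (l1, l2, t, t3, t4): the first two sample L-moments and the
    sample L-Cv, L-Cs and L-Ck, via unbiased probability-weighted moments."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least 4 values are needed for L-Ck")
    b = [0.0, 0.0, 0.0, 0.0]
    for j, xj in enumerate(x, start=1):  # j is the rank of xj (1..n)
        w = 1.0
        for r in range(4):
            b[r] += w * xj
            if r < 3:
                # extend the weight (j-1)...(j-r) / ((n-1)...(n-r)) by one factor
                w *= (j - 1 - r) / (n - 1 - r)
    b0, b1, b2, b3 = (v / n for v in b)
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l2 / l1, l3 / l2, l4 / l2


# Illustrative input: an evenly spaced (uniform-like) sample
l1, l2, t, t3, t4 = sample_lmoments([1, 2, 3, 4, 5])
```

For this evenly spaced sample the L-skewness and L-kurtosis are both zero, which matches the theoretical values for a uniform distribution.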

3.1.2 The index-flood procedure

The key assumption of an index-flood procedure is that the \(N\) sites form a homogeneous region and the frequency distributions of the \(N\) sites are identical apart from a site-specific scaling factor (Hosking and Wallis 1993). The index-flood actually is a location estimator which is commonly the at-site sample mean \({\overline{x} }_{i}\). The frequency values (\({q}_{Tj}\)) at several desired return periods (\({T}_{j}\)) of the dimensionless regional distribution are called regional growth factors (RGFs). The quantiles for return periods (\({T}_{j}\)) at site \(i\) can be written as (Hosking and Wallis 1993):

$${Q}_{Tj,i}={\overline{x} }_{i}{q}_{Tj},j=2 \mathrm{years}, 5 \mathrm{years},\dots ,100 \mathrm{years},\dots , 1000 \mathrm{years}$$
(4)

The \({q}_{Tj}\) can be determined by a set of regional parameters for a selected distribution. The regional parameters are weighted average values over \(N\) sites, with the parameter estimate of site \(i\) given weight proportional to the length of record, \({n}_{i}\). The at-site parameter estimate is obtained via the L-moments method based on rescaling data at site \(i\) to its mean \({\overline{x} }_{i}\).
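The index-flood calculation in Eq. (4) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the regional L-moment ratios are taken as record-length-weighted averages of at-site ratios, the GEV is used as the regional distribution, and its parameters come from Hosking's well-known approximation for the shape parameter. All station values below are hypothetical.

```python
import math


def weighted_mean(values, record_lengths):
    """Record-length-weighted regional average of an at-site statistic."""
    total = sum(record_lengths)
    return sum(v * n for v, n in zip(values, record_lengths)) / total


def gev_from_lmoments(l1, l2, t3):
    """GEV location, scale and shape from L-moments (Hosking's approximation)."""
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    if abs(k) < 1e-6:
        raise ValueError("shape is ~0; the Gumbel limit is not handled in this sketch")
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    xi = l1 - alpha * (1.0 - g) / k
    return xi, alpha, k


def growth_factor(params, return_period):
    """Quantile of the dimensionless regional GEV for a return period in years."""
    xi, alpha, k = params
    f = 1.0 - 1.0 / return_period  # non-exceedance probability
    return xi + alpha * (1.0 - (-math.log(f)) ** k) / k


# Hypothetical region with three sites (values are illustrative only)
site_lcv = [0.20, 0.24, 0.22]
site_lcs = [0.18, 0.25, 0.21]
record_lengths = [41, 25, 30]
site_means_mm = [120.0, 135.0, 110.0]

t_r = weighted_mean(site_lcv, record_lengths)
t3_r = weighted_mean(site_lcs, record_lengths)
# Data are rescaled by the at-site mean, so the regional mean is 1 and l2 = L-Cv
params = gev_from_lmoments(1.0, t_r, t3_r)
q100 = growth_factor(params, 100)
site_q100_mm = [m * q100 for m in site_means_mm]  # Eq. (4)
```

Because the regional distribution is dimensionless, the same growth factor is reused at every site in the region and only the at-site mean changes.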

3.1.3 Identification of homogeneous regions

The aim of identifying homogeneous regions is to form groups of sites that approximately satisfy the homogeneity condition. In the homogeneous region, the statistical parameters (\(L-{C}_{v}\), \(L-{C}_{s}\), and \(L-{C}_{k}\)) of the sites are consistent within a tolerance. The measure to judge the degree of homogeneity of the region is (Hosking and Wallis 1998):

$${H}_{1}=\frac{{V}_{L-Cv}-{\overline{V} }_{L-Cv}}{{\sigma }_{L-Cv}}$$
(5)

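where \({V}_{L-Cv}\) is the record-length-weighted standard deviation of the at-site sample \(L-{C}_{v}\) over the stations in a region, and \({\overline{V} }_{L-Cv}\) and \({\sigma }_{L-Cv}\) are the mean and standard deviation of the same quantity computed from Monte-Carlo simulations of a homogeneous region. \({H}_{1}\) is thus the standardized \({V}_{L-Cv}\). Hosking and Wallis (1993) suggested that a region can be considered “acceptably homogeneous” if \({H}_{1}<1\). The effect of \(L-{C}_{s}\) on the formation of homogeneous regions is also considered.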
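A minimal sketch of Eq. (5) is given below. It assumes that the dispersion measure is the record-length-weighted standard deviation of the at-site \(L-{C}_{v}\) (as in Hosking and Wallis) and that the simulated dispersion values from a homogeneous region are already available; the simulation step itself is not reproduced, and the function names are illustrative.

```python
import math
import statistics


def weighted_lcv_dispersion(site_lcv, record_lengths):
    """Record-length-weighted standard deviation of at-site L-Cv values."""
    total = sum(record_lengths)
    t_r = sum(n * t for t, n in zip(site_lcv, record_lengths)) / total
    return math.sqrt(
        sum(n * (t - t_r) ** 2 for t, n in zip(site_lcv, record_lengths)) / total
    )


def heterogeneity_h1(v_observed, v_simulated):
    """Standardize the observed dispersion against simulated homogeneous regions."""
    mu = statistics.mean(v_simulated)
    sigma = statistics.stdev(v_simulated)
    return (v_observed - mu) / sigma


# Hypothetical inputs: two sites and three simulated dispersion values
v_obs = weighted_lcv_dispersion([0.1, 0.3], [10, 10])
h1 = heterogeneity_h1(v_obs, [0.01, 0.02, 0.03])
acceptably_homogeneous = h1 < 1
```

A strongly negative value means the sites vary less than a simulated homogeneous region would, which typically points to correlation between sites.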

3.1.4 Selection of an appropriate probability distribution

In each region, only the best-fitting distribution is used for the regional frequency analysis, so that the calculated quantiles are as accurate as possible. Three goodness-of-fit tests are used to select the best distribution from five commonly used three-parameter frequency distributions: generalized logistic (GLO), generalized extreme value (GEV), generalized normal (GNO), generalized Pareto (GPA), and Pearson type III (PE3).

  • Test 1: the Monte-Carlo simulation test

The goodness of fit is judged by the deviation from the regional average of \(L-{C}_{s}\) and \(L-{C}_{k}\) of the observed data to the \(L-{C}_{s}\) and \(L-{C}_{k}\) of the fitted distribution. For each distribution, the measure is (Hosking and Wallis 1998):

$${Z}^{\mathrm{DIST}}=({\tau }_{4}^{\mathrm{DIST}}-{t}_{4}^{R}+{\beta }_{4})/{\sigma }_{4}$$
(6)

where DIST refers to the candidate distribution; \({t}_{4}^{R}\) is the average \(L-{C}_{k}\) value computed from the data of a given region; \({\tau }_{4}^{\mathrm{DIST}}\) is the \(L-{C}_{k}\) of the candidate distribution fitted to the data; and \({\beta }_{4}\) and \({\sigma }_{4}\) are the bias and standard deviation of the regional average \(L-{C}_{k}\), respectively, both estimated from Monte-Carlo simulation samples generated from a kappa distribution.

For a confidence level of 90%, a distribution is acceptable if \(\left|{Z}^{\mathrm{DIST}}\right|\le 1.64\). Among accepted distributions, the distribution with the smallest \(\left|{Z}^{\mathrm{DIST}}\right|\) is the most appropriate \(L-{C}_{k}\) distribution.

  • Test 2: root mean square error of the sample L-moments

In this test, a weighted root mean square error (RMSE) calculated for each of the candidate distributions serves as an index. It is expressed by Lin et al. (2006):

$$\mathrm{RMSE}={(\sum\nolimits_{i=1}^{N}{n}_{i}{\left({S}_{i,L-Ck}-{D}_{i,L-Ck}\right)}^{2}/\sum\nolimits_{i=1}^{N}{n}_{i})}^\frac{1}{2},i=1, 2,\dots ,N$$
(7)

where \({n}_{i}\) is the data length at site \(i\), and \({S}_{i,L-Ck}\) and \({D}_{i,L-Ck}\) are the sample \(L-{C}_{k}\) and the distribution’s \(L-{C}_{k}\) at site \(i\), respectively. The distribution with the smallest RMSE is the most appropriate distribution.

  • Test 3: real data check test

Having had a quantile estimate from a fitted distribution at a given return period (\({T}_{j}\)) based on the real data series, an empirical exceedance frequency (\({F}_{i,{T}_{j}}\)) to the quantiles can be calculated. Then, \({F}_{i,{T}_{j}}\) is compared with its corresponding theoretical probability (\({P}_{{T}_{j}}\)). A relative error (RE) can be calculated over several return periods to reflect the degree of the match between the empirical frequencies and the theoretical probabilities at site \(i\). The smaller the RE, the better the fitting will be. Then, the regional average RE calculated over N sites in a homogeneous region can be used as an index for the goodness of fit (Lin et al. 2006):

$$\mathrm{RE}=\sum\nolimits_{j}\sum\nolimits_{i=1}^{N}\frac{{F}_{i,{T}_{j}}-{P}_{{T}_{j}}}{{P}_{{T}_{j}}}/\sum\nolimits_{j}\sum\nolimits_{i=1}^{N}1, i=1, 2,\dots ,N;j=2 \mathrm{years},5 \mathrm{years},\dots , 100 \mathrm{years}$$
(8)

For each return period, REs over the five three-parameter frequency distributions mentioned above are ranked from large to small, and the total ranking number over 2 years, 5 years, 10 years, 25 years, and 50 years for each distribution is taken as an index, called RE score, to evaluate the goodness of fit of the distribution. The higher the RE score, the better the fitted distribution will be.

The above three goodness-of-fit tests are performed for each homogeneous region, and then, a final decision to choose the most appropriate distribution can be made based on a summary of the test results.
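The three indices in Eqs. (6) to (8) can be sketched as below. This is an illustration only: the inputs (fitted \(L-{C}_{k}\) values, simulation bias and standard deviation, fitted quantiles) are assumed to have been computed already, the empirical exceedance frequency is taken as the fraction of observed annual maxima above the fitted quantile, and the relative error is averaged with its sign exactly as Eq. (8) is written. Function names are hypothetical.

```python
import math


def z_statistic(tau4_dist, t4_regional, bias4, sigma4):
    """Eq. (6): standardized distance between fitted and regional L-Ck."""
    return (tau4_dist - t4_regional + bias4) / sigma4


def weighted_rmse(sample_lck, dist_lck, record_lengths):
    """Eq. (7): record-length-weighted RMSE of L-Ck over the sites."""
    num = sum(n * (s - d) ** 2
              for s, d, n in zip(sample_lck, dist_lck, record_lengths))
    return math.sqrt(num / sum(record_lengths))


def empirical_exceedance(annual_maxima, quantile):
    """Fraction of observed annual maxima that exceed the fitted quantile."""
    return sum(1 for v in annual_maxima if v > quantile) / len(annual_maxima)


def regional_relative_error(site_series, site_quantiles, return_periods):
    """Eq. (8): mean of (F - P) / P over all sites and return periods.

    site_quantiles[i][T] is the fitted quantile at site i for return period T.
    """
    total, count = 0.0, 0
    for series, quantiles in zip(site_series, site_quantiles):
        for period in return_periods:
            p = 1.0 / period  # theoretical exceedance probability
            f = empirical_exceedance(series, quantiles[period])
            total += (f - p) / p
            count += 1
    return total / count


# Hypothetical values for one candidate distribution
z = z_statistic(0.15, 0.12, 0.0, 0.02)
accepted_at_90_percent = abs(z) <= 1.64
```

Because the signed errors in Eq. (8) can cancel across sites, a variant that averages absolute errors may be preferable when the index is used on its own.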

3.2 Mean bias correction

The fundamental idea of the MBC method is to correct the estimates based on IMERG rainfall for a desired return period by using their overall deviation from the corresponding estimates based on observational rainfall. This method assumes that, within the same homogeneous region, the quantiles obtained from IMERG data have a uniform deviation (Gjertsen et al. 2003). For each homogeneous region, the correction factor B can be calculated by the following formula:

$$B=\frac{\sum_{i=1}^{n}{P}_{si}}{\sum_{i=1}^{n}{P}_{gi}}$$
(9)

where n is the number of rain gauge stations in the homogeneous region, and \({P}_{si}\) and \({P}_{gi}\) are the estimates for the desired return period at station \(i\) and at its adjacent grid, respectively.
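Equation (9) and its application can be sketched as follows, assuming the factor is applied multiplicatively to every grid quantile in the region; the function names and the numbers are illustrative only.

```python
def mbc_factor(station_quantiles, grid_quantiles):
    """Eq. (9): ratio of summed station estimates to summed adjacent-grid estimates."""
    if len(station_quantiles) != len(grid_quantiles):
        raise ValueError("each station needs exactly one adjacent grid estimate")
    return sum(station_quantiles) / sum(grid_quantiles)


def apply_mbc(grid_quantiles, factor):
    """Scale every grid quantile in the homogeneous region by the same factor."""
    return [factor * q for q in grid_quantiles]


# Hypothetical region with two stations and their adjacent grids (mm)
factor = mbc_factor([100.0, 120.0], [110.0, 130.0])
corrected = apply_mbc([110.0, 130.0], factor)
```

By construction, the corrected grid estimates at the station locations sum to the same total as the station estimates, while the spatial pattern within the region is preserved.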

4 Results and discussion

4.1 Consistency test of hydrological and meteorological precipitation data

In this study, the annual maximum series (AMS) approach was used to construct 24-h annual maximum precipitation (AMP) at stations and grids. For IMERG data, daily rainfall should be converted to 24-h precipitation first. By comparing the annual maximum value of daily and 24-h precipitation at 8 meteorological stations, the average amplification factor of IMERG daily rainfall was determined to be 1.17.
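The construction of the series described above can be sketched as follows: take the largest daily value in each calendar year, then scale it by the daily-to-24-h factor of 1.17. The input layout (pairs of year and daily rainfall) and the function names are assumptions made for illustration.

```python
def annual_maxima(daily_records):
    """Largest daily rainfall per calendar year from (year, rainfall_mm) pairs."""
    best = {}
    for year, rainfall_mm in daily_records:
        if year not in best or rainfall_mm > best[year]:
            best[year] = rainfall_mm
    return dict(sorted(best.items()))


def daily_to_24h(annual_max_daily, factor=1.17):
    """Convert annual maximum daily rainfall to 24-h rainfall with a fixed factor."""
    return {year: value * factor for year, value in annual_max_daily.items()}


# Hypothetical daily records for one grid cell
records = [(2001, 10.0), (2001, 50.0), (2002, 30.0), (2002, 12.5)]
amp_24h = daily_to_24h(annual_maxima(records))
```

The factor compensates for the fact that a fixed calendar day rarely brackets the wettest 24-h window, so daily maxima tend to underestimate 24-h maxima.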

For the precipitation observations, after the 24-h AMP was obtained at each station, data quality control was carried out to examine whether the precipitation data from meteorological and hydrological stations have the same distribution and can be used together in frequency analysis. To this end, 6 pairs of closely located hydrological and meteorological rainfall stations were selected (59776–40046, 57988–40068, 59082–40075, 59090–40045, 59081–40074, 59094–40083), and the statistical consistency of the precipitation data within each pair was tested using the t test (α = 0.05). The results showed that the data in all 6 pairs have the same distribution at the 5% significance level.
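The paper does not say which variant of the t test was used. As one possible reading, the sketch below computes Welch's two-sample t statistic and compares it with a tabulated two-sided critical value supplied by the user; note that a t test compares means only, not the full distributions. The function names and sample values are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = statistics.variance(sample_a) / na
    vb = statistics.variance(sample_b) / nb
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def means_consistent(sample_a, sample_b, t_critical):
    """True when the null hypothesis of equal means is not rejected."""
    t, _ = welch_t(sample_a, sample_b)
    return abs(t) <= t_critical


# Hypothetical annual maxima at two neighbouring stations
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(a, b)
# 2.306 is the two-sided 5% critical value of Student's t for 8 degrees of freedom
consistent = means_consistent(a, b, 2.306)
```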

4.2 Parameter estimation

Based on the 24-h AMP at stations and grids, the statistical parameters (mean, \({L-C}_{v}\), \({L-C}_{s}\), and \({L-C}_{k}\)) were calculated with the L-moments method, and their spatial distributions are mapped in Fig. 2. The mean AMP ranges from 93.32 to 155.63 mm at sites and from 101.22 to 169.15 mm at grids. The largest means are mainly found in the southern plains, while the extreme rainfall in the northwestern mountainous region is relatively small. The \({L-C}_{v}\) at sites shows that the AMP from observational rainfall is highly dispersed in western Shaoguan, and the maximum zone of \({L-C}_{v}\) at grids is mainly located in the northwest corner, the southwest corner, and the east-central mountainous region; that is, the AMP from IMERG rainfall in these areas contains more unusually large or small values. The \({L-C}_{s}\) reflects the asymmetry of the frequency distribution of the AMP values, while \({L-C}_{k}\) describes the steepness of the peak of the AMP distribution. Studies in China and abroad have shown a good positive correlation between \({L-C}_{s}\) and \({L-C}_{k}\) (Hosking and Wallis 1998). The spatial distributions of \({L-C}_{s}\) and \({L-C}_{k}\) at sites are generally similar, indicating that the western mountainous region of Shaoguan has more frequent and stronger extreme precipitation. In contrast, the spatial distributions of \({L-C}_{s}\) and \({L-C}_{k}\) at grids differ considerably, although both have a maximum zone in the northwestern plain, meaning that the distributions of AMP from IMERG have a steep peak and a thick tail in this area. In addition, the \({L-C}_{s}\) at grids also reaches maxima in the southwest corner and the east-central mountainous region, which coincide with the maxima of \({L-C}_{v}\) at grids, indicating that extreme precipitation there far exceeds the average AMP from IMERG.

Fig. 2 Spatial distribution of mean, \({L-C}_{v}\), \({L-C}_{s}\), and \({L-C}_{k}\) at sites or grids for 24-h AMP in Shaoguan area

The five curves in Fig. 3 are the theoretical \({L-C}_{s}\) and \({L-C}_{k}\) relationships of the five commonly used three-parameter distributions, and the scatter points of these two parameters at the 70 stations and 504 grids are also plotted in Fig. 3. When the Shaoguan area is treated as one homogeneous region, the points of both the stations and the IMERG grids are widely scattered, indicating that a single distribution cannot be chosen for the 24-h extreme rainfall of all stations or grids. Therefore, Shaoguan should be divided into several homogeneous regions in the next step.

Fig. 3 The theoretical curves of \({L-C}_{s}\) and \({L-C}_{k}\) of five commonly used three-parameter distributions and the scatter points of the parameters at 70 stations and at 504 grids

4.3 Homogeneous region of Shaoguan

4.3.1 According to 24-h AMP from observational rainfall

Based on the criteria for identifying homogeneous regions mentioned above, the Shaoguan area was delineated into four subregions for 24-h AMP from precipitation observations (Fig. 4). Table 1 presents the homogeneity measure (\({H}_{1}\)) and the goodness-of-fit test results of each subregion. The grouping is reasonable in terms of homogeneity because all the values of \({H}_{1}\) are less than 1. In each test, the distributions ranked first in goodness of fit can be considered appropriate. Taking the results of the three goodness-of-fit tests together, the best-fit distribution for each homogeneous region was determined. Ultimately, GEV is the best-fit distribution for all four homogeneous regions.

Fig. 4 Sketch of 24-h homogeneous regions of Shaoguan and the best-fit distributions for homogeneous regions according to 24-h AMP from observational rainfall

Table 1 Numbers of sites, values of homogeneity measure (H1), and goodness-of-fit test results, as well as the best-fit distributions for 24-h homogeneous regions of Shaoguan according to observational rainfall data

4.3.2 According to 24-h AMP from IMERG rainfall

First, following the results in Section 4.3.1, the same homogeneous regions were used for IMERG. The homogeneity measure (\({H}_{1}\)) of each subregion is presented in Table 2. The PE3 distribution of region 1 and the GEV, GNO, and PE3 distributions of region 3 passed the Monte-Carlo simulation test (the value of \(\left|{Z}^{\mathrm{DIST}}\right|\) of these distributions does not exceed 1.64). However, the value of \({H}_{1}\) of region 1 was greater than 1, while that of region 3 was less than −3, indicating that region 1 did not pass the homogeneity test and that the extreme precipitation at the grids of region 3 was strongly correlated. The homogeneity of regions 2 and 4 was acceptable, but none of the five commonly used frequency distributions passed the Monte-Carlo simulation test, meaning that no suitable frequency distribution could be selected for these two regions. In addition, because the stations are sparse, there are few stations near the boundaries of the four regions, so it is uncertain which region the grids near a boundary should be assigned to, and this affects the final frequency estimates. Therefore, the homogeneous regions of Shaoguan should be re-divided according to the statistical parameters from IMERG rainfall.

Table 2 Numbers of grid, values of homogeneity measure (\({H}_{1}\)), and Monte-Carlo simulation test results

In addition to the criteria mentioned above, each delineated subregion must also contain rain gauge stations, so that the quantiles at grids can later be corrected with the results at sites. Finally, 344 IMERG grids were used in this process, and the Shaoguan area was delineated into 18 subregions for 24-h AMP from IMERG rainfall (Fig. 5). Table 3 presents the homogeneity measure (\({H}_{1}\)) and the best-fit distribution selected for each homogeneous region. All the values of \({H}_{1}\) are less than 1, and most of them are between −2 and −3 because of the strong correlation of gridded rainfall data. Furthermore, the best-fit distributions differ among the 18 regions, for example between region 14 and the adjacent region 5. In Section 4.2, \({L-C}_{s}\) and \({L-C}_{k}\) at grids indicated that the frequency distributions of extreme precipitation in the northwestern plain have a steep peak and a thick tail, which leads to the GEV distribution being selected for region 5. For region 14, the mean AMP at grids is close to that of region 5, but \({L-C}_{v}\) and \({L-C}_{s}\) are larger while \({L-C}_{k}\) is smaller according to Fig. 2, indicating that the AMP at the grids of region 14 is more scattered and includes larger extreme precipitation, and that the peak of the distribution is gentler. Thus, the distribution of extreme precipitation in region 14 has a longer and thinner tail, which leads to the GPA distribution being selected for region 14. Similarly, the best-fit distributions of the other regions are related to the characteristics of their statistical parameters.

Fig. 5 Sketch of the 24-h homogeneous regions of Shaoguan and the best-fit distributions for the homogeneous regions according to 24-h AMP from IMERG rainfall

Table 3 Numbers of grid, values of homogeneity measure (H1), and goodness-of-fit test results, as well as the best-fit distributions for 24-h homogeneous regions of Shaoguan according to IMERG rainfall data

4.4 Rainfall quantiles in Shaoguan area

The 24-h rainfall quantiles at recurrence intervals of 1 to 10,000 years at sites and at grids in the Shaoguan area were obtained by using the index-flood procedure. The two sets of results were plotted and compared with each other (Fig. 6). On the whole, the quantiles at grids are slightly higher than those at sites if the return period is less than 40 years, but the opposite holds beyond 40 years, and the difference between them increases with the return period. According to the statistics, the means of 24-h AMP from observational rainfall and from IMERG data are close, but the \({L-C}_{s}\) and \({L-C}_{k}\) at stations (0.22 and 0.18) are larger than those at grids (0.17 and 0.12), indicating that the extreme precipitation in the observational rainfall is more frequent and stronger than that in the IMERG data, possibly because the GPM satellites capture weak precipitation well but detect heavy rainfall poorly. The peak of the distribution at stations is steeper and its tail is thicker than at grids, which results in larger estimates at stations for higher return periods. Therefore, the quantiles at grids should be corrected to reduce their estimation error.

Fig. 6 Comparison of quantiles at sites and at grids

In this paper, the quantiles at grids at a recurrence interval of 2 years were corrected via the MBC method, and the revised results were checked in two ways. The first was to check the adjusted frequency estimates against actual data. For site 59097, which has the longest observation record in the study area, the 24-h AMP series is 41 years long and the maximum observed rainfall is 306.5 mm. The 2-year estimates at the adjacent grid before and after correction are 149.78 mm and 131.93 mm, respectively, with empirical frequencies of 0.31 and 0.45 according to the observed data of site 59097. The corrected empirical frequency is clearly closer to 0.5, the theoretical probability of the 2-year return period, indicating that the adjusted frequency estimate is more reasonable. Similarly, the empirical frequency at other grids was also improved. The second approach was to compare the contour maps of the 2-year estimates before and after correction. As Fig. 7 shows, the frequency estimates in central and southern Shaoguan decreased significantly after the revision, but the precipitation maximum center did not change. Meanwhile, the precipitation minimum centers in the east-central and southeastern parts became more prominent; the former does not appear in the estimates at sites. The corrected results combine the characteristics of the quantiles at sites and at grids, meaning that the spatial distribution of the frequency estimates was appropriately adjusted.
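As a rough consistency check on the numbers quoted for the grid adjacent to site 59097, and assuming the correction at this grid is the purely multiplicative regional factor of Eq. (9), the two reported 2-year estimates imply

$$B\approx \frac{131.93\ \mathrm{mm}}{149.78\ \mathrm{mm}}\approx 0.88$$

that is, the IMERG-based 2-year estimate in this homogeneous region was reduced by roughly 12%.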

Fig. 7 The contour plots of 2-year estimates for 24 h at sites (a) and at grids (before correction (b) and after correction (c))

Finally, the corrected 24-h rainfall estimates for the 100-year, 200-year, 500-year, and 1000-year return periods are mapped in Fig. 8. The quantiles increase with the return period, and the spatial distribution of the estimates is approximately the same for the different return periods, with estimates in the south larger than those in the north. The maximum rainfall estimates occur from the southwest to the center, as well as in the west-central part, and the minimum values are found in the northeast. The prevailing moisture jet in Shaoguan is oriented from south to north; thus, heavy rain often occurs from the southwest to the center because of local topographic forcing. The estimates are therefore consistent with observed conditions and can help provide a scientific basis for future flood control decisions.

Fig. 8 The distribution map of rainfall quantiles over 24 h at recurrence intervals of 100 years (a), 200 years (b), 500 years (c), and 1000 years (d) after correction

5 Summary and conclusions

In this study, 24-h AMP from observational precipitation and from IMERG gridded rainfall was used to estimate 1- to 10,000-year rainfall quantiles for 24 h in the Shaoguan area via RLMA. The estimates at grids were then compared with those at sites and corrected on the basis of the latter by the MBC method. The findings, as well as some critical questions that should be analyzed and discussed in further studies, are summarized as follows:

  1. The t test showed that the precipitation observations from meteorological stations and hydrological stations have the same distribution and can be used together in frequency analysis.

  2. The mean AMP at grids is slightly larger than that at stations, and the maxima of both are mainly distributed in the southern plains. The \({L-C}_{v}\) at stations shows that the AMP from observational rainfall is highly dispersed in western Shaoguan, while the maximum zone of \({L-C}_{v}\) at grids is located in the northwest corner, the southwest corner, and the east-central mountainous region. The \({L-C}_{s}\) and \({L-C}_{k}\) at both sites and grids indicate that northwestern Shaoguan has more frequent and stronger extreme precipitation.

  3. For 24-h AMP from observational precipitation, Shaoguan was delineated into 4 regions because the stations used were distributed unevenly and sparsely. If the same regions are used for 24-h AMP from IMERG rainfall, one of the regions shows strong correlation, and no suitable distribution can be selected for two of the regions. Based on the statistical parameters at grids, the number of homogeneous regions for IMERG was finally set to 18.

  4. For stations, GEV is the best-fit distribution in all regions. For IMERG, however, the frequency distributions chosen differ among regions. For example, GEV is selected for region 5 and GPA for region 14, because \({L-C}_{v}\) and \({L-C}_{s}\) of region 14 are larger while \({L-C}_{k}\) is smaller, indicating that the distribution for region 14 has a thinner tail.

  5. Compared with the estimates at sites, the quantiles calculated from IMERG rainfall were underestimated if the return period is greater than 100 years, and the difference increases with the return period. The reason is that the \({L-C}_{s}\) and \({L-C}_{k}\) at sites are larger than those at grids, which means that the extreme precipitation in the observational rainfall is more frequent and stronger than that in the IMERG data. Thus, the GPM satellites may detect heavy rainfall poorly.

  6. Taking the 2-year estimates at grids as an example, the MBC method was shown to combine the characteristics of the quantiles at sites and at grids well: the adjusted frequency estimates are more reasonable, and the precipitation center does not change.

  7. The spatial distributions of the revised 100-year, 200-year, 500-year, and 1000-year 24-h rainfall quantiles at grids are similar. The high-risk rainstorm area of Shaoguan extends from the southwest to the center, together with the west-central part, and deserves special attention in disaster prevention and reduction.

At present, the density and coverage of rain gauges still limit the accuracy of the univariate L-moments method, especially in remote areas of China and in most parts of the world. Therefore, the main purpose of this study was to explore and evaluate the use of satellite precipitation data in the univariate L-moments method. The results show that frequency estimates calculated from satellite products with the univariate L-moments method are applicable and, after correction, more reasonable. Future work should consider multivariate L-moments, which extend the univariate approach and have attracted the attention of many researchers worldwide (Chebana and Ouarda 2007; Chebana et al. 2009; Wu et al. 2017; Zhang et al. 2015), using satellite precipitation data together with other variables that affect rainfall, such as temperature and relative humidity.