Background

Extreme hydro-meteorological events such as heavy rainfall, floods, storms and typhoons are regarded as among the most costly natural hazards and are a leading research topic, with a wide scope of scientific applications in hydrology and water resources engineering (Bruce 1994; Obasi 1994). The coastal parts of East Asia are hit extensively and repeatedly by climatic disasters, which leave substantial effects on hydrological functions (Jun 1989; Shabri et al. 2011; Chang et al. 2012; Cai et al. 2014). Motivated by this, the study attempts a reliable estimation of extreme rainfall occurrences and a corresponding regional frequency analysis (RFA) using the L-moments approach, so as to support efficient design and control of the hydrological systems of South Korea.

The Hancheon catchment of Jeju Island, South Korea, is taken as the study domain. It covers an area of 37.39 sq. km and exhibits dynamic and distinct hydrological characteristics. Jeju Island has experienced several typhoon events over the years. In the last decade in particular, a number of typhoons, such as Typhoons Nari, Khanun, Bolaven, Sanba and Nakri, hit the island, killing a total of 11 people and causing property damage worth around 1.41 million USD.

Regional frequency analysis is a continuously developing technique used by local disaster management departments. Many researchers also regard it as a contemporary method for delineating hydrologically similar regions and as an improvement over conventional probability moments (e.g. Bradley 1998; Parida et al. 1998; Fowler and Kilsby 2003; Kumar and Chatterjee 2005; Wallis et al. 2007; Noto and Loggia 2009; Saf 2009; Shahzadi et al. 2013; Devi and Choudhury 2013; Liu et al. 2015). Um et al. (2010) studied five distribution models to examine extreme rainfall events in Jeju Island using elevation and geographic coordinates as modeling inputs. That study discussed multiple non-linear and linear regressions and an intensity-duration-frequency (IDF) relationship curve, and obtained model accuracies in the range of 18–86%. A number of similar studies have explored and updated different working methods for RFA of extreme rainfall occurrences. The most prominent are cluster analysis (Easterling 1989; Venkatesh and Jose 2007), L-moments analysis (Hosking 1990), L-moments combined with cluster analysis (Schaefer 1990; Guttman 1993; Wallis et al. 2007; Satyanarayana and Srinivas 2008), spatial correlation analysis (Gadgil and Yadumani 1993), homogeneity tests (Wiltshire 1986) and regional frequency analysis techniques (Eslamian and Feizi 2007; Ngongondo et al. 2011; Hossein and Arash 2014; Zhang and Hall 2004). Because of the data shortage and hydro-meteorological complexity, the L-moment approach for RFA was adopted here to estimate extreme rainfall occurrences efficiently while taking account of the spatial variability of the study area.

The primary intent of this study is to carry out an efficient RFA of extreme rainfall by applying the L-moment approach developed by Hosking and Wallis (1997). The specific aims are: a) to carry out an RFA of the 6-h, 12-h and 24-h maximum consecutive rainfall series using the L-moment approach, b) to evaluate the accuracy of the design rainfall and the reliability of the goodness-of-fit test using consecutive-hour rainfall (likely to be more accurate than daily and monthly rainfall), and c) to provide estimates with 90% confidence intervals for the uncertainty analysis. The information obtained from the study may provide useful probability distribution results for extreme rainfall events.

Study area and data description

The Hancheon catchment lies between latitudes 32°54′ and 33°31′ N and longitudes 126°30′ and 126°33′ E (Fig. 1). The catchment covers only 2% of Jeju Island, but its orographic condition significantly influences the variability of the rainfall. Its altitude ranges from 150 m to 1950 m above mean sea level, with a mean slope of 10.8 degrees (Kar et al. 2015). The major stream of the catchment is the Hancheon stream, which originates at Hallasan Mountain and flows from south to north into the ocean. The stream offers a significant control volume and plays a vital role in conveying runoff after continuous rainfall (Yang et al. 2015). The weather of Jeju Island varies seasonally because of the monsoon climate. About 43% of the total annual rainfall occurs in summer (June to August) and autumn (September to November). Every year, typhoon events bring extreme consecutive-hour rainfall and tropical winds that cause flash floods.

Fig. 1 Location of Hancheon catchment in Jeju Island and rainfall stations

To account for the spatial and temporal variability of rainfall, the Korean Meteorological Administration (KMA) of Jeju province collects historical meteorological data across the Hancheon catchment using tipping-bucket gauges. In this study, hourly rainfall data from five gauge stations adjacent to the catchment (Fig. 1), namely Jeju, Ara, Eorimok, Witsaeorum and Jindallaebat, were used; the data were processed by the automatic weather stations (AWS) of the Jeju regional meteorological administration. The record lengths vary from 11 to 50 years. The annual average rainfall near the coastal area is 1560 mm, while the rest of the catchment receives about 2061 mm (Jung et al. 2014). Fig. 1 shows the Hancheon catchment, the Hancheon stream and the locations of the selected KMA rainfall stations, together with the Digital Elevation Model (DEM) of the study area, which indicates the northward slope of the catchment. Table 1 gives a brief summary of the five selected rainfall stations and the datasets used.

Table 1 List and type of the five rainfall stations utilized for analysis

Methods

L-moments: Theoretical background

The L-moments approach was first introduced by Hosking (1990). It is a suitable statistical modeling tool that facilitates the estimation of probability distributions and frequency analysis. In recent years, statistical studies of rainfall extremes have commonly applied method-of-moments estimators to annual maximum (viz. hourly, daily, monthly) time series, particularly in regional analysis. L-moments provide reasonably efficient estimates of hydrological data characteristics and distribution parameters. Practical advantages of the L-moment approach include better performance with limited data samples and less biased estimates of dispersion, skewness and kurtosis than ordinary moments of probability distributions. Hosking (1990) characterized L-moments, which are closely related to probability weighted moments (PWMs), as:

$$ {\lambda}_r=\frac{1}{r}\sum_{k=0}^{r-1}{\left(-1\right)}^k\binom{r-1}{k} E\left\{{X}_{r-k:r}\right\} $$
(1)

Here, λr is the r-th L-moment of the distribution of X, a linear function of the expected order statistics; Xk:n denotes the k-th smallest observation in a sample of size n, and r = 1, 2, 3, … is a positive integer. From eq. (1), the first four L-moments can be written as:

$$ {\lambda}_1= EX $$
(2)
$$ {\lambda}_2=\left(\frac{1}{2}\right) E\left({X}_{2:2}-{X}_{1:2}\right) $$
(3)
$$ {\lambda}_3=\left(\frac{1}{3}\right) E\left({X}_{3:3}-2{X}_{2:3}+{X}_{1:3}\right) $$
(4)
$$ {\lambda}_4=\left(\frac{1}{4}\right) E\left({X}_{4:4}-3{X}_{3:4}+3{X}_{2:4}-{X}_{1:4}\right) $$
(5)

Hosking (1990) describes the utility of L-moment ratios as ratio estimators in hydrological extreme analysis, such as:

$$ {\tau}_2=\text{L-}C_v=\frac{\lambda_2}{\lambda_1} $$
(6)
$$ {\tau}_3=\text{L-Skewness}=\frac{\lambda_3}{\lambda_2} $$
(7)
$$ {\tau}_4=\text{L-Kurtosis}=\frac{\lambda_4}{\lambda_2} $$
(8)

Where τ2 is the coefficient of L-variation (relative dispersion), τ3 is the measure of skewness (shape) with values between −1 and 1, and τ4 is the measure of kurtosis (peakedness). Notably, these ratio estimators and their graphical diagrams are particularly well suited to identifying the distributional properties of highly skewed data. Following the above equations, the L-moment ratios of the 6-h, 12-h and 24-h rainfall are presented for each region in this study.
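Eqs. (1) to (8) define population quantities; in practice they are estimated from a sample. The following is a minimal Python sketch of one common route, the unbiased sample probability weighted moments, and is not the lmomRFA code used in the study. The function name `sample_lmoments` and the example rainfall values are hypothetical, chosen only for illustration.

```python
def sample_lmoments(data):
    """Sample L-moments and L-moment ratios via unbiased PWMs b0..b3."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least four observations are needed")
    b = [0.0, 0.0, 0.0, 0.0]
    for j, xj in enumerate(x):          # j is the zero-based rank
        weight = 1.0
        for r in range(4):
            b[r] += weight * xj
            if r < 3:                   # build (j)(j-1).. / (n-1)(n-2)..
                weight *= (j - r) / (n - 1 - r)
    b = [value / n for value in b]
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return {"l1": l1, "l2": l2, "t": l2 / l1, "t3": l3 / l2, "t4": l4 / l2}


# Hypothetical annual maximum 24-h rainfall depths (mm), illustration only.
example_series = [112.0, 95.5, 140.2, 210.8, 88.1, 175.4, 131.9, 302.6]
print(sample_lmoments(example_series))
```

The returned keys `t`, `t3` and `t4` correspond to the sample versions of τ2, τ3 and τ4 in eqs. (6) to (8).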

Data screening by discordancy measure

The discordancy measure (Di) is used to screen out data from unusual sites and to identify appropriate datasets for regionalization. If \( {\mathrm{u}}_{\mathrm{i}}={\left[{\mathrm{t}}^{\left(\mathrm{i}\right)},{\mathrm{t}}_3^{\left(\mathrm{i}\right)},{\mathrm{t}}_4^{\left(\mathrm{i}\right)}\right]}^{\mathrm{T}} \) is the vector containing the L-moment ratios for site i (Hosking and Wallis 1993), then the discordancy measure may be defined as:

$$ {D}_i=\frac{1}{3}{\left({u}_i-\overline{u}\right)}^T{S}^{-1}\left({u}_i-\overline{u}\right) $$
(9)

Where ui is the vector of L-CV, L-Skewness and L-Kurtosis for site i, S is the sample covariance matrix of the ui and \( \overline{\mathrm{u}} \) is their mean vector.
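As an illustration of eq. (9), the sketch below computes Di for a small set of sites in plain Python. It assumes S is the sample covariance matrix with divisor N − 1, and it needs at least four sites so that S can be inverted. The function names and the L-moment ratio values are hypothetical and are not taken from Table 3.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-15:
            raise ValueError("covariance matrix is singular")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def discordancy(ratios):
    """D_i = (1/3) (u_i - u_bar)^T S^-1 (u_i - u_bar) for each site."""
    n = len(ratios)
    if n < 4:
        raise ValueError("need at least four sites")
    mean = [sum(u[k] for u in ratios) / n for k in range(3)]
    dev = [[u[k] - mean[k] for k in range(3)] for u in ratios]
    # Sample covariance matrix (divisor n - 1): an assumption of this sketch.
    s = [[sum(d[a] * d[b] for d in dev) / (n - 1) for b in range(3)]
         for a in range(3)]
    result = []
    for d in dev:
        y = solve_linear(s, d)          # y = S^-1 d without forming S^-1
        result.append(sum(di * yi for di, yi in zip(d, y)) / 3.0)
    return result


# Hypothetical (L-CV, L-Skewness, L-Kurtosis) triples for five sites.
sites = [(0.20, 0.15, 0.12), (0.22, 0.18, 0.16), (0.19, 0.21, 0.11),
         (0.25, 0.12, 0.19), (0.35, 0.40, 0.30)]
print(discordancy(sites))
```

With this normalization the Di values of a region always sum to N − 1, which gives a quick sanity check on any implementation.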

Regional heterogeneity test

Identification of homogeneous regions is a significant step in regional frequency analysis. The test statistic compares the between-site dispersion of the sample L-moment ratios with what would be expected in a homogeneous region. Hosking and Wallis (1993) proposed such a statistical test, defined as the heterogeneity measure (H). To determine the expected dispersion, Monte Carlo simulations of rainfall with record lengths equal to those of the observed data were performed, as is common in hydrological analysis. The heterogeneity measure (H) is obtained as:

$$ H=\frac{V_{obs}-{\mu}_v}{\sigma_v} $$
(10)

Here, μv and σv are the mean and standard deviation of the V values of the simulated data, respectively. Vobs is calculated from the regional data using one of three V-statistics (V1, V2, V3), as follows:

$$ {V}_1={\left[\frac{\sum_{i=1}^N{N}_i{\left({t}^{(i)}-{t}^R\right)}^2}{\sum_{i=1}^N{N}_i}\right]}^{1/2} $$
(11)
$$ {V}_2=\frac{\sum_{i=1}^N{N}_i{\left\{{\left({t}^{(i)}-{t}^R\right)}^2+{\left({t}_3^{(i)}-{t}_3^R\right)}^2\right\}}^{1/2}}{\sum_{i=1}^N{N}_i} $$
(12)
$$ {V}_3=\frac{\sum_{i=1}^N{N}_i{\left\{{\left({t}_3^{(i)}-{t}_3^R\right)}^2+{\left({t}_4^{(i)}-{t}_4^R\right)}^2\right\}}^{1/2}}{\sum_{i=1}^N{N}_i} $$
(13)

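Here Ni is the record length at site i, t(i), t3(i) and t4(i) are the sample L-CV, L-Skewness and L-Kurtosis at site i, and tR, t3R and t4R are the corresponding regional averages. For the H statistic, Hosking and Wallis (1993) suggested that a region is acceptably homogeneous if H < 1, possibly heterogeneous if 1 ≤ H < 2 and definitely heterogeneous if H ≥ 2.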
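A minimal Python sketch of eqs. (10) and (11) and the H criterion follows. It assumes the regional average L-CV is weighted by record length and that the simulated V values are already available from a Monte Carlo run; generating them is outside this sketch. All function names and numbers are hypothetical.

```python
import math
import statistics


def v1_statistic(lcv, record_lengths):
    """Record-length-weighted standard deviation of site L-CVs, eq. (11)."""
    total = sum(record_lengths)
    t_reg = sum(n * t for n, t in zip(record_lengths, lcv)) / total
    spread = sum(n * (t - t_reg) ** 2 for n, t in zip(record_lengths, lcv))
    return math.sqrt(spread / total)


def heterogeneity(v_obs, v_simulated):
    """H = (V_obs - mean of simulated V) / std of simulated V, eq. (10)."""
    mu_v = statistics.mean(v_simulated)
    sigma_v = statistics.stdev(v_simulated)   # sample standard deviation
    return (v_obs - mu_v) / sigma_v


def classify(h):
    """Thresholds suggested by Hosking and Wallis (1993)."""
    if h < 1:
        return "acceptably homogeneous"
    if h < 2:
        return "possibly heterogeneous"
    return "definitely heterogeneous"


# Hypothetical two-site region and a handful of simulated V values.
v_obs = v1_statistic([0.21, 0.24], [50, 15])
h = heterogeneity(v_obs, [0.010, 0.018, 0.025, 0.014, 0.021])
print(round(v_obs, 4), round(h, 2), classify(h))
```

In a real analysis `v_simulated` would hold several hundred values (500 in this study) rather than five.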

Goodness-of-fit measure

The regional frequency distribution is chosen using L-moment ratio diagrams and a goodness-of-fit measure that compares the regional average sample L-Kurtosis with the theoretical L-Kurtosis of the fitted distribution. For a particular distribution, the goodness-of-fit measure is calculated as follows:

$$ {Z}^{Dist}=\frac{t_4^R-{\tau}_4^{Dist}}{\sigma_4} $$
(14)

Here, \( {\mathrm{t}}_4^{\mathrm{R}} \) is the regional average L-Kurtosis of the data from a given region, \( {\uptau}_4^{\mathrm{Dist}} \) is the theoretical L-Kurtosis of the fitted distribution and σ4 is the standard deviation of the regional average L-Kurtosis obtained from simulated data. At an approximate 90% confidence level, the fit is acceptable if |ZDist| ≤ 1.64.
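As a small worked illustration of eq. (14), the Python sketch below evaluates ZDist for the Gumbel distribution, whose theoretical L-Kurtosis is the constant 16 − 10 log2(3) ≈ 0.1504. The regional L-Kurtosis and σ4 values passed in are hypothetical; in the study both come from the data and the simulations.

```python
import math

# Theoretical L-Kurtosis of the Gumbel distribution (a fixed constant).
GUMBEL_TAU4 = 16.0 - 10.0 * math.log(3.0) / math.log(2.0)


def z_statistic(t4_regional, tau4_dist, sigma4):
    """Goodness-of-fit measure of eq. (14)."""
    return (t4_regional - tau4_dist) / sigma4


def is_acceptable(z, critical=1.64):
    """Accept the distribution at roughly the 90% level if |Z| <= 1.64."""
    return abs(z) <= critical


# Hypothetical regional average L-Kurtosis and simulated standard deviation.
z = z_statistic(t4_regional=0.17, tau4_dist=GUMBEL_TAU4, sigma4=0.03)
print(round(GUMBEL_TAU4, 4), round(z, 2), is_acceptable(z))
```

For three-parameter candidates such as the GEV, the theoretical L-Kurtosis depends on the fitted L-Skewness, so `tau4_dist` would have to be computed from the fit rather than taken as a constant.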

Estimation of regional rainfall quantiles

Frequency analysis of maximum consecutive hour rainfall data assumes that the sites in a homogeneous region share the same quantile distribution apart from a site-specific scale factor (Dalrymple 1960). In the simulations, quantile estimates for various robust probability distributions were calculated. For site i, with non-exceedance probability F, regional growth curve q(F) and site scaling factor l1 (the site mean), the quantile is computed as Qi(F) = l1q(F), where q is the common dimensionless function; the T-year quantile corresponds to F = 1 − 1/T. For the simulation of a homogeneous region, the simulated regions had the same number of stations, record lengths, heterogeneity and L-moment ratios as the observed data. During simulation, the quantile errors, root mean square error (RMSE) and 90% error bounds were estimated, from which the accuracy level can be assessed.
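The relation Qi(F) = l1q(F) can be sketched in Python for the Gumbel case, one of the two distributions adopted later in the paper. The sketch assumes the regional data are normalized to a mean of 1, so the growth curve is fixed by the regional L-CV alone, and it uses the standard L-moment estimators for the Gumbel parameters. The function names, the site mean and the L-CV value are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def gumbel_from_lmoments(l1, l2):
    """Gumbel location xi and scale alpha from the first two L-moments."""
    alpha = l2 / math.log(2.0)
    xi = l1 - EULER_GAMMA * alpha
    return xi, alpha


def gumbel_quantile(f, xi, alpha):
    """Quantile at non-exceedance probability f, with 0 < f < 1."""
    return xi - alpha * math.log(-math.log(f))


def site_quantile(site_mean, regional_lcv, return_period):
    """T-year quantile at a site: site mean times regional growth factor."""
    f = 1.0 - 1.0 / return_period
    # Normalized region: l1 = 1, so l2 equals the regional L-CV.
    xi, alpha = gumbel_from_lmoments(1.0, regional_lcv)
    return site_mean * gumbel_quantile(f, xi, alpha)


# Hypothetical site mean (mm) and regional L-CV, illustration only.
for years in (5, 10, 20, 50, 100):
    print(years, round(site_quantile(100.0, 0.2, years), 1))
```

A GEV growth curve follows the same pattern but needs a third parameter estimated from the regional L-Skewness.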

All statistical analyses and graphical representations for this study were done in the R statistical program (version 3.2.0) and MS Excel 2007. The L-moment computations used the R package lmomRFA (version 3.0–1), developed by Hosking (2009).

Results and discussions

Rainfall availability

The substantial differences in elevation and geographical location create considerable variation in daily rainfall patterns across the study area. To investigate the L-moments method using consecutive hour rainfall, the rainfall series were divided into 6-h, 12-h and 24-h durations. Rainfall analysis was carried out for each station and for each year separately. Among the five rainfall stations, the Witsaeorum station (near the highest peak of Jeju Island) received the maximum daily rainfall of 1396.5 mm (Table 1) in a single calendar day since the AWS began recording rainfall data. The maximum 6-h, 12-h and 24-h rainfall data were also analyzed alongside the daily maximum rainfall values. Figure 2 illustrates the daily maximum and annual total rainfall at each station, indicating the temporal and spatial fluctuations. Compared with the other years, the highest number of extreme rainfall events occurred in 2012. The maximum daily rainfall at Jeju station was 615.6 mm, whereas the total annual rainfall was 2526.0 mm. For the Ara, Eorimok, Witsaeorum and Jindallaebat stations, the maximum daily rainfall values were 838.5 mm, 909.5 mm, 1396.5 mm and 1183.5 mm respectively, while the total annual rainfall values for 2012 were 3461.5 mm, 4459.0 mm, 6514.5 mm and 7317.0 mm respectively. A potential reason is the orographic rainfall effect of the mountainous topography. It is therefore necessary to analyze the regional frequency with respect to elevation.

Fig. 2 Temporal rainfall data availability for five stations

Stationarity and independence tests

Preliminary data checks were carried out using the Mann-Kendall test (Mann 1945; Kendall 1975) and auto-correlation function (ACF) analysis to verify that the maximum 24-h consecutive rainfall series is suitable for regional frequency analysis. The results of the Mann-Kendall trend test in Table 2 show no significant trend over time at any of the rainfall stations. It is therefore reasonable to treat the datasets as stationary series. The ACF coefficients for ‘lag 1’ to ‘lag 13’ are plotted in Fig. 3(a) and (b). The ACF values for each station lie below the critical bound (ACF = 0.4), so the maximum consecutive hour rainfall series can be considered time-independent. In addition, a spatial autoregressive calculation (Dong and Harris 2015) showed that the cross correlations between stations were not significant at the 5% level (p-values), so the data can be considered spatially independent.
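Both checks can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used to produce Table 2 or Fig. 3: the Mann-Kendall variance below omits the correction for tied values, the p-value uses the normal approximation, and the example series is hypothetical.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)        # sign of the difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0    # no tie correction
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return s, z, 2.0 * (1.0 - cdf)


def acf(series, lag):
    """Sample autocorrelation coefficient at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom


# Hypothetical annual maximum 24-h rainfall series (mm), illustration only.
rain = [180.5, 142.0, 260.3, 199.8, 151.2, 310.4, 175.9, 228.6, 163.3, 245.1]
s, z, p = mann_kendall(rain)
print(s, round(z, 2), round(p, 3), [round(acf(rain, k), 2) for k in (1, 2, 3)])
```

A p-value above 0.05 would be read as no significant trend, matching the interpretation of Table 2 above.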

Table 2 Summary of trend analysis of maximum hourly rainfall series using Mann-Kendall test
Fig. 3 a Autocorrelation function (ACF) plot of Jeju, Ara and Eorimok stations. b Autocorrelation function (ACF) plot of Jindallaebat and Witsaeorum stations

Identification of homogeneous region by cluster based analysis

One of the initial steps of the regional frequency analysis was the identification of homogeneous regions, as described in the Methods section. To identify such regions on the basis of hydro-meteorological and geospatial similarities, a hierarchical cluster (dendrogram) analysis following Ward’s method (Ward 1963) was used. The dendrogram provided information on probable clusters and indicated that the study area consists of three homogeneous regions. The appropriateness of this choice was also tested with the heterogeneity measure (H). The three regions, namely region 1 comprising the Jeju and Ara stations, region 2 comprising the Eorimok station, and region 3 comprising the Jindallaebat and Witsaeorum stations, are illustrated in Figs. 4 and 5.
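Ward’s method merges, at each step, the two clusters whose union gives the smallest increase in the within-cluster sum of squares. The Python sketch below shows that rule for a handful of sites. The site attributes are invented, already-scaled values for five unnamed sites; they are not the attributes of the five stations, and the attributes actually used for clustering in the study are not reproduced here.

```python
def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion, stopped at k clusters."""
    # Each cluster is (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members_a, cen_a = clusters[a]
                members_b, cen_b = clusters[b]
                na, nb = len(members_a), len(members_b)
                dist2 = sum((x - y) ** 2 for x, y in zip(cen_a, cen_b))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        members_a, cen_a = clusters[a]
        members_b, cen_b = clusters[b]
        na, nb = len(members_a), len(members_b)
        merged = (members_a + members_b,
                  [(na * x + nb * y) / (na + nb) for x, y in zip(cen_a, cen_b)])
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(members) for members, _ in clusters)


# Invented, pre-scaled attributes (e.g. elevation, mean rainfall) for 5 sites.
attributes = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (2.0, 2.1), (2.1, 2.0)]
print(ward_clusters(attributes, 3))
```

Recording the merge cost at each step would give the heights needed to draw a dendrogram such as Fig. 4.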

Fig. 4 Dendrogram of clustered stations by Ward’s method

Fig. 5 Location of three homogeneous regions in Hancheon catchment

Region 1 (Jeju and Ara stations) is situated in the urban northern part of Jeju Island, with an average elevation of 253 m and an average annual rainfall of around 1835 mm. Region 2 (Eorimok station) is located in the middle portion of the Hancheon catchment, a semi-urban area with an average elevation of 950 m and an average rainfall of 2436 mm. Region 3 (Witsaeorum and Jindallaebat stations) is situated near Hallasan Mountain, with an average elevation of 1570 m and an average rainfall of around 2361 mm. The rainfall characteristics of region 3 are strongly influenced by tropical and mountainous winds.

Estimation of L-moments, homogeneity test and best fitted distribution

The L-moments, discordancy (Di) and heterogeneity measure (H) of each region were computed with the lmomRFA package in R. First, time-series scatter plots of the annual L-CV, L-Skewness and L-Kurtosis ratios were developed for each of the three regions (Fig. 6). In all plots, the L-moment ratio values were confined within 0.1 to 0.4. For regions 2 and 3, the results indicated a parallel shift of values, attributed to abrupt daily and annual rainfall occurrences. In general, the L-moment ratios for the three regions showed rising values over the sampling period.

Fig. 6 Yearly variation of L-moments ratio for three regions

Secondly, the discordancy (Di) was computed from eq. (9) and found to be less than 3.0, suggesting that none of the regions is discordant (Table 3). The heterogeneity measures (H) were computed using eqs. (10) to (13) with 500 simulations in R. All H-statistic values were lower than 1.0 (H < 1.0), confirming that the regions are homogeneous. Candidate distributions were then fitted to identify the best statistical distribution for each region, and the ZDist measure (Hosking and Wallis 1993) was calculated by eq. (14). For the best fitted distributions, ZDist was 0.54, 1.25 and 1.03 for regions 1, 2 and 3 respectively. These values are below 1.64 and therefore meet the goodness-of-fit criterion at the 90% confidence level for the individual homogeneous regions. The differences between regions are understandable given their distinctive hydro-geological conditions. Based on the discordancy, heterogeneity measure and ZDist values, the Gumbel and generalized extreme value (GEV) distributions are suggested for the frequency analysis of each homogeneous region of the Hancheon catchment.

Table 3 Discordance, heterogeneity measure and best fitted distribution for three regions

Estimation of regional growth curves

The regional quantile estimates were obtained throughout by regional frequency analysis and were found to be reliable. Robust estimation was needed where more than one regional distribution was involved. In such cases, Monte Carlo simulation was used to estimate the percentage root mean square error (RMSE) at the 90% confidence level. The estimates of q(F) for different non-exceedance probabilities are shown in Table 4, and the regional growth curves for the three regions are presented in Fig. 7. The error bounds of the hourly maximum rainfall varied from 1.046 to 2.303 mm for region 1, 1.027 to 4.135 mm for region 2 and 0.960 to 7.829 mm for region 3. The wider error bounds in regions 2 and 3 may be due to the considerable spatial fluctuation of elevation in their undulating mountainous topography, which causes uncertainty in rainfall prediction. Furthermore, the RMSE values ranged from 0.014 (5-yr) to 0.237 (100-yr) mm for the Gumbel distribution (regions 1 and 2) and from 0.115 (5-yr) to 0.301 (100-yr) mm for the GEV distribution (region 3). The results indicate that the GEV distribution performs better in the mountainous region (region 3), whereas the Gumbel distribution provides better results for the urban area (region 1) and the semi-mountainous area with transitional slopes (region 2).
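The idea behind the simulated RMSE and error bounds can be sketched as follows in Python. This is a deliberately simplified, single-site version under stated assumptions: data are drawn from a known Gumbel distribution, refitted by L-moments, and the ratio of estimated to true quantile is summarized by its relative RMSE and its empirical 5% and 95% points. The regional procedure in lmomRFA simulates whole regions and is more elaborate. All parameter values and function names are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def gumbel_quantile(f, xi, alpha):
    return xi - alpha * math.log(-math.log(f))


def gumbel_fit(data):
    """Fit Gumbel parameters from the first two sample L-moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * xj for j, xj in enumerate(x)) / (n * (n - 1))
    l1, l2 = b0, 2.0 * b1 - b0
    alpha = l2 / math.log(2.0)
    return l1 - EULER_GAMMA * alpha, alpha


def simulate_accuracy(xi, alpha, n_years, return_period, n_sim=500, seed=1):
    """Relative RMSE and 5%/95% points of estimated/true quantile ratios."""
    rng = random.Random(seed)
    f = 1.0 - 1.0 / return_period
    true_q = gumbel_quantile(f, xi, alpha)
    ratios = []
    for _ in range(n_sim):
        sample = []
        for _ in range(n_years):
            # Keep the uniform draw strictly inside (0, 1).
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            sample.append(gumbel_quantile(u, xi, alpha))
        xi_hat, alpha_hat = gumbel_fit(sample)
        ratios.append(gumbel_quantile(f, xi_hat, alpha_hat) / true_q)
    ratios.sort()
    rmse = math.sqrt(sum((r - 1.0) ** 2 for r in ratios) / n_sim)
    lower = ratios[int(0.05 * (n_sim - 1))]
    upper = ratios[int(0.95 * (n_sim - 1))]
    return rmse, lower, upper


# Hypothetical growth-curve parameters and record length.
print(simulate_accuracy(xi=0.83, alpha=0.29, n_years=30, return_period=100))
```

Repeating the call for several return periods shows how the spread of the ratios widens with return period, which is the behavior reported in Table 4.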

Table 4 Simulation results of estimated regional quantiles, RMSE and corresponding 90% error bounds values
Fig. 7 Estimated regional growth curve for three homogeneous regions

Regional quantile analysis

The present study derives rainfall quantiles by the L-moments based technique for all stations of the Hancheon catchment. Quantiles were estimated for return periods of 5, 10, 20, 50, 70, 80 and 100 years (Table 5). Rainfall records of various lengths were used to estimate the return period rainfall. The Jeju station area shows 165.12 to 333.97 mm of rainfall, while the other stations show markedly different ranges of rainfall depth. Adjacent to Hallasan Mountain (Jindallaebat station), the rainfall quantile ranges from a minimum of 183.46 mm to a maximum of 555.18 mm. A linear regression was also fitted to the return period values of all stations, giving r-square values of 0.842 to 0.974 and p-values below 0.001, which indicates statistical significance.

Table 5 Results of the consecutive hour (6-h, 12-h, 24-h) regional rainfall quantile for five stations

The station analysis showed that the probable rainfall at the Eorimok station (163.23 mm to 311.03 mm) is lower than at the other stations. This is attributed to the station’s location on the transitional steep slopes between forest and hilly regions, which yields less rainfall accumulation owing to varying elevations and slopes. Overall, the L-moments technique gave reliable predictions across the statistical analyses considered, and the method can therefore be suggested for policy and decision making pertaining to hydrological catchment design.

Conclusions

The maximum consecutive hour rainfall data were analyzed using the L-moments approach to study spatial homogeneity, probability distributions and regional frequency estimates. The analysis entailed careful screening of historical rainfall data and a cluster based dendrogram analysis. From Ward’s classification, three reasonably homogeneous regions were suggested for the Hancheon catchment (Jeju and Ara in region 1, Eorimok in region 2 and Jindallaebat and Witsaeorum in region 3). The discordancy and heterogeneity tests showed no discordant values in the datasets. The L-moment ratio values varied within 0.1 to 0.4, which were considered as the statistical thresholds for the regional frequency analysis. The study concluded that the Gumbel and generalized extreme value (GEV) distributions are the more successful and reliable models for the Hancheon catchment, as indicated by relatively low RMSE values (at the 90% probability level). The analysis showed better rainfall predictions for region 1 (error bounds between 1.046 and 2.303 mm), whereas larger errors were found for region 2 (error bounds between 1.027 and 4.135 mm) and region 3 (error bounds between 0.960 and 7.829 mm). Considering the spatial variations of hydro-meteorological and topographic characteristics, the rainfall estimates for the different regions can be considered useful hydrological design attributes. Despite these findings, a number of limitations persist, leaving potential for future research. The datasets used were limited, so the scope of the study was confined to a single statistical approach (L-moments). With more hydro-meteorological data, additional statistical and locally established mathematical tools could be considered, which could further refine the technique.
Furthermore, the study was carried out using only five rainfall stations. With more rainfall stations, finer homogeneous clusters may be developed, which would further improve the local accuracy of the rainfall estimates. Nevertheless, given the size and locations of the available data, the study yielded good results. Moreover, the methodological framework used in the study is applicable not only to Jeju Island but can also be applied in other similar areas where rainfall records are limited and the land slope is steep.