Introduction

Rainfall intensity–duration–frequency (IDF) curves are graphical representations of the probability that a given average rainfall intensity will occur within a given period of time (Dupont and Allen 2000). By providing a mathematical relationship between the rainfall intensity \(i\), the duration \(d\), and the return period \(T\) (or, equivalently, the annual frequency of exceedance \(f\)), IDF curves allow the return period of an observed rainfall event to be estimated or, conversely, the rainfall intensity corresponding to a given return period (Elsebaie 2012). Design storms derived from IDF curves are commonly adopted in water resources engineering for designing urban drainage systems, evaluating the endurance of hydraulic structures, and assessing regional flood vulnerabilities (Keifer and Chu 1957).

The first IDF curve was established as early as 1932, and many sets of IDF relationships have since been constructed for several parts of the world (Chow 1988; Gellens 2002; Grimaldi et al. 2011). As presented in Fig. 1, the typical steps to derive IDF curves are as follows (Koutsoyiannis et al. 1998; Nhat et al. 2006):

Fig. 1

Schematic diagram of deriving IDF curves

  1. Retrieve the extreme rainfall intensities for a specific duration through annual maximum analysis;

  2. Fit the extreme rainfall intensity time series, for each duration, to a theoretical distribution function, e.g. Generalised Extreme Value (GEV), Gumbel, Pearson III;

  3. Calculate the rainfall intensity, for each duration and return period, based on the selected distribution function; and

  4. Construct the IDF curves following the empirical formulae, e.g. Talbot, Bernard, Kimijima, Sherman, through regression techniques.
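Steps 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in this study: it assumes the Gumbel distribution (one of the options named in step 2) fitted by the method of moments, and the function names and the input values are hypothetical.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def annual_maxima(intensity_by_year):
    """Step 1: largest intensity (mm/h) recorded in each year, in year order."""
    return [max(values) for _, values in sorted(intensity_by_year.items())]


def fit_gumbel_moments(sample):
    """Step 2: Gumbel location and scale from the sample mean and standard deviation."""
    scale = stdev(sample) * math.sqrt(6.0) / math.pi
    location = mean(sample) - EULER_GAMMA * scale
    return location, scale


def gumbel_quantile(location, scale, return_period):
    """Step 3: intensity with annual non-exceedance probability 1 - 1/T."""
    non_exceedance = 1.0 - 1.0 / return_period
    return location - scale * math.log(-math.log(non_exceedance))


# Hypothetical 1-h intensities (mm/h) for five years; not data from this study.
hourly_intensity = {
    2001: [12.0, 48.5, 30.1],
    2002: [55.2, 20.0, 41.3],
    2003: [38.7, 61.0, 15.4],
    2004: [44.9, 29.8, 52.6],
    2005: [70.3, 33.3, 26.1],
}

amax = annual_maxima(hourly_intensity)
loc, scale = fit_gumbel_moments(amax)
for T in (10, 50, 100):
    print(f"T = {T:3d} years: {gumbel_quantile(loc, scale, T):.1f} mm/h")
```

Repeating the three functions for every duration of interest gives the intensity values that step 4 then regresses against duration.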

Deriving IDF curves requires long-term historical rainfall observations. When fine-timescale (e.g. sub-daily) rainfall records are not available, the characteristics of extreme rainfall intensities at short durations, and hence their distribution functions, might not be captured. In step (4), this missing information results in regression errors that are particularly pronounced at short durations.

Fine-timescale rainfall observations, however, are often a luxury for many regions because of the high cost, confidentiality, and time-consuming procedures involved in data acquisition and sharing. As proxies for in situ rainfall measurements, satellite- or radar-based precipitation products have been widely used to derive IDF curves where data are scarce. Ombadi et al. (2018) developed a methodological framework to adjust the bias and transform the satellite areal rainfall to point rainfall, and applied the products to develop IDF curves in ungauged regions. Marra et al. (2017) compared IDF curves from radar and satellite estimates over the eastern Mediterranean and quantified the uncertainty related to their limited record on varying climates. Awadallah and Awadallah (2012) used Tropical Rainfall Measuring Mission (TRMM) satellite data to derive the relations between maximum sub-daily, daily and monthly rainfall and combined the relations with the coarse rainfall measurements to develop IDF curves for a data-scarce region in Africa. However, these studies either rely solely on satellite or radar data without utilizing local rainfall measurements, e.g. Ombadi et al. (2018) and Marra et al. (2017), or use a simple scaling approach that lacks a theoretical basis, e.g. Awadallah and Awadallah (2012).

Physically based and stochastic rainfall disaggregation models have emerged as popular alternative approaches to produce finer-scale precipitation from coarser-scale information (Mason 1986; Wilby et al. 2002). Physically based models are known for a cascade of uncertainty, which arises from the many uncertain processes involved (hence complex sets of differential equations) and the variety of atmospheric variables required, whereas stochastic rainfall disaggregation methods demand low computational power and minimal model inputs. The latter are therefore selected in the present study and require only two datasets in the study case, i.e. daily rainfall observations and publicly available GSMaP satellite rainfall.

Several stochastic rainfall disaggregation models have been developed, such as (1) the random cascade models based on scale-invariance theory (Gupta and Waymire 1993; Carsteanu and Foufoula-Georgiou 1996; Molnar and Burlando 2005), and (2) the Bartlett-Lewis/Neyman-Scott rectangular pulses models based on point process theory (Onof and Wheater 1993; Khaliq and Cunnane 1996; Koutsoyiannis and Mamassis 2001; Gyasi-Agyei and Mahbub 2007). Studies have been conducted to assess the suitability and effectiveness of different disaggregation methods (Ferraris et al. 2003; Serinaldi 2010; Licznar et al. 2011). A general conclusion as to the best model, however, cannot be drawn, because rainfall properties differ significantly between climatic regions with different generating mechanisms (Sharma and Mehrotra 2010). In view of its wide applicability in various climatic conditions, the Bartlett-Lewis rectangular pulses (BLRP) model, in particular HyetosR, is selected in this study as the rainfall disaggregation tool. HyetosR is an R package developed to disaggregate daily rainfall into hourly time series, which combines a rainfall simulation model based on the BLRP process with a proportional adjusting procedure to rescale the hourly totals to the required daily values (Koutsoyiannis and Onof 2000, 2001; Kossieris et al. 2012). Hourly rainfall observations are not necessary as a direct input for HyetosR; however, sub-daily rainfall statistics are still required for the parameter estimation of the BLRP model.

As an alternative source of sub-daily rainfall statistics, remote sensing rainfall from the Global Satellite Mapping of Precipitation (GSMaP) is acquired and used in this study. The GSMaP project was initiated for a study “Production of a high-precision, high-resolution global precipitation map using satellite data”, promoted by the Japan Aerospace Exploration Agency (JAXA) Precipitation Measuring Mission (PMM) Science Team (Okamoto et al. 2005; Kubota et al. 2007). GSMaP offers near-real-time hourly global rainfall maps with a resolution of 0.1 degree from January 1998 to November 2010. However, capturing rainfall extremes by satellite products has been recognized as an open issue (Endreny and Imbeah 2009; AghaKouchak et al. 2011; Gourley et al. 2011; Stampoulis et al. 2013); GSMaP rainfall is no exception and in general underestimates rainfall intensity (Dinku et al. 2010; Tian et al. 2010). This limitation, together with the short record (< 13 years), hinders the direct application of GSMaP rainfall in deriving IDF curves and especially affects the accuracy at longer return periods.

The present study attempts to make the best of daily rainfall observations and sub-daily satellite-based rainfall to derive IDF curves that are able to capture the frequency characteristics of short-duration rainfall extremes. The use of the BLRP model is explored to disaggregate daily observations into hourly series by using satellite-based rainfall characteristics, i.e. statistics derived from GSMaP. The GSMaP rainfall statistics are used to guide the parameter optimization of the BLRP model, which is then applied to downscale the daily rainfall observations into hourly rainfall. The proportional adjusting procedure in the BLRP model adjusts the daily sums of the disaggregated rainfall to the magnitude of the daily rainfall observations and hence overcomes the tendency of GSMaP to underestimate rainfall. The annual maxima (AMAX) of the downscaled hourly rainfall are then extracted through extreme value analysis and fed as inputs to derive the intended higher-resolution IDF curves. An ensemble of 100 BLRP simulations is conducted to reduce the uncertainties in the random process of the BLRP model. The resulting IDF curves are evaluated against the IDF curves constructed from real hourly observations and those constructed from daily rainfall observations alone, to assess whether the proposed approach can indeed reproduce the frequency characteristics of sub-daily rainfall extremes. The methodology, study case, and conclusions are elaborated in the following sections.

Methodology

Figure 2 illustrates the proposed IDF deriving scheme, which is conducted in 4 steps.

Fig. 2

Flow diagram of the proposed downscaled rainfall IDF derivation

Step 1 Calculate rainfall statistics

The reanalysis version of the GSMaP hourly rainfall (GSMaP_MVK Ver.5.222) is downloaded from the JAXA database (ftp://rainmap:amechi-zu@hokusai.eorc.jaxa.jp/).

Rainfall statistics are calculated after extracting the rainfall time series at the GSMaP grid cell nearest to the rainfall measurement station. The statistics include the mean \(E_{G}\), variance \({\text{var}}_{G}\), auto-covariance \({\text{acov}}_{G}\) and probability of dry \(P_{G}\), where the subscript \(G\) indicates the rainfall statistics calculated from the GSMaP rainfall.
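As an illustration of Step 1, the sketch below computes the four statistics from an hourly series aggregated to a chosen timescale. It rests on several assumptions not stated in the text: the auto-covariance is taken at lag 1, the variance is the population variance, a block is counted as dry when its depth does not exceed a threshold (zero by default), and the function names and sample series are hypothetical.

```python
from statistics import mean, pvariance


def aggregate(hourly, hours):
    """Sum consecutive non-overlapping blocks of `hours` values; drop an incomplete tail."""
    n_blocks = len(hourly) // hours
    return [sum(hourly[k * hours:(k + 1) * hours]) for k in range(n_blocks)]


def lag1_autocovariance(series):
    """Sample covariance between each value and its successor (lag 1, assumed)."""
    m = mean(series)
    pairs = zip(series, series[1:])
    return sum((a - m) * (b - m) for a, b in pairs) / (len(series) - 1)


def rainfall_statistics(hourly, hours, dry_threshold=0.0):
    """Mean, variance, lag-1 auto-covariance and probability of dry at one timescale."""
    depths = aggregate(hourly, hours)
    return {
        "mean": mean(depths),
        "var": pvariance(depths),
        "acov": lag1_autocovariance(depths),
        "p_dry": sum(1 for x in depths if x <= dry_threshold) / len(depths),
    }


# Hypothetical 48-h series (mm per hour); not GSMaP data.
hourly = [0.0] * 10 + [1.2, 4.8, 0.6] + [0.0] * 20 + [2.0, 0.4] + [0.0] * 13

for h in (1, 2, 6, 12, 24):
    print(h, rainfall_statistics(hourly, h))
```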

Step 2 Estimate BLRP model parameters

The Bartlett-Lewis rectangular pulses (BLRP) model is essentially a random-parameter rainfall generator based upon a Poisson cluster process (Rodriguez-Iturbe et al. 1987, 1988). This paper uses the most enriched version of the BLRP model with seven parameters, i.e. \(\lambda\), \(\kappa\), \(\phi\), \(\alpha\), \(\nu\), \(\mu_{X}\), and \(\sigma_{X}\) (Onof and Wheater 1994). The description of the BLRP model and the physical meaning of the seven parameters are detailed in Appendix A.

The model parameters can be estimated by minimizing an objective function defined as

$${\text{obj}} = \mathop \sum \limits_{h} \left( {w_{1} \left| {\frac{{E - E_{G} }}{{E_{G} }}} \right| + w_{2} \left| {\frac{{{\text{var}} - {\text{var}}_{G} }}{{{\text{var}}_{G} }}} \right| + w_{3} \left| {\frac{{{\text{acov}} - {\text{acov}}_{G} }}{{{\text{acov}}_{G} }}} \right| + w_{4} \left| {\frac{{P - P_{G} }}{{P_{G} }}} \right|} \right)$$
(1)

where the mean \(E\), variance \({\text{var}}\), auto-covariance \({\text{acov}}\), and probability of dry \(P\) (defined as the “ratio of the number of dry recordings to the total number of recorded data”) are the modelled rainfall statistics, which are functions of the aforementioned seven parameters (refer to Appendix A), \(w_{1}\) to \(w_{4}\) are the weights applied to the respective absolute relative errors, and \(h\) is the timescale. Five timescales are selected herein, i.e. 1 h, 2 h, 6 h, 12 h, and 24 h. Equation (1) implies that rainfall statistics at various timescales are needed to estimate the parameters of the BLRP model.
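Equation (1) translates directly into code. The sketch below assumes that the modelled and GSMaP statistics are held in dictionaries keyed by timescale; the layout, the key names, and the numbers are hypothetical, and the modelled statistics would in practice come from the BLRP expressions in Appendix A.

```python
STAT_KEYS = ("mean", "var", "acov", "p_dry")  # order matches w1..w4 in Eq. (1)


def objective(modelled, gsmap, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of absolute relative errors over all timescales, Eq. (1)."""
    total = 0.0
    for timescale, target in gsmap.items():
        for weight, key in zip(weights, STAT_KEYS):
            relative_error = (modelled[timescale][key] - target[key]) / target[key]
            total += weight * abs(relative_error)
    return total


# Hypothetical statistics at the 1-h timescale only.
gsmap_stats = {1: {"mean": 0.20, "var": 2.0, "acov": 0.5, "p_dry": 0.90}}
model_stats = {1: {"mean": 0.22, "var": 1.0, "acov": 0.5, "p_dry": 0.90}}

print(objective(model_stats, gsmap_stats))
print(objective(model_stats, gsmap_stats, weights=(1.0, 0.1, 1.0, 1.0)))
```

Lowering a weight reduces the influence of the corresponding statistic on the fit, which is useful when one of the target statistics is known to be less reliable.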

In this paper, the seven parameters of the BLRP model are optimized simultaneously to minimize the objective function, i.e. Eq. (1), using a genetic algorithm (GA), which solves optimization problems based on the mechanics of natural genetics (Goldberg 1989).

Step 3 Disaggregate daily rainfall observations

The BLRP model is run separately for each cluster of wet days, i.e. a series of consecutive wet days delimited by at least one dry day. Several runs are conducted for each cluster until the departure of the daily sums falls below an acceptable limit, which is herein set to 0.1 mm as a trade-off between model accuracy and program running time. The departure \(\delta\) is defined as

$$\delta = \left[ {\mathop \sum \limits_{k = 1}^{L} \ln^{2} \left( {\frac{{Z_{k} + c}}{{\tilde{Z}_{k} + c}}} \right)} \right]^{{\frac{1}{2}}}$$
(2)

where \(L\) is the number of wet days in the cluster, \(c = 0.1\) mm is a small constant to avoid a zero denominator, and \(Z_{k}\) and \(\tilde{Z}_{k}\) are, respectively, the observed and simulated daily rainfall depths on day \(k\).
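The cluster definition and Eq. (2) can be sketched as follows. The wet-day threshold of zero and the function names are assumptions made for illustration; the sample depths are hypothetical.

```python
import math


def wet_day_clusters(daily, dry_threshold=0.0):
    """Split a daily series into runs of consecutive wet days separated by dry days."""
    clusters, current = [], []
    for depth in daily:
        if depth > dry_threshold:
            current.append(depth)
        elif current:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)
    return clusters


def departure(observed, simulated, c=0.1):
    """Eq. (2): root of the summed squared log-ratios of observed to simulated daily depths."""
    return math.sqrt(
        sum(math.log((z + c) / (z_sim + c)) ** 2
            for z, z_sim in zip(observed, simulated))
    )


# Hypothetical daily depths (mm).
daily_obs = [0.0, 5.0, 3.0, 0.0, 0.0, 2.0, 0.0]
print(wet_day_clusters(daily_obs))
print(departure([5.0, 3.0], [4.6, 3.3]))
```

In a disaggregation loop, a simulated cluster would be regenerated until `departure` falls below the chosen limit.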

The cluster is further processed to scale the hourly rainfall depths through the proportional adjusting procedure according to

$$X_{s} = \tilde{X}_{s} \left( {\frac{Z}{{\mathop \sum \nolimits_{s = 1}^{24} \tilde{X}_{s} }}} \right)$$
(3)

where \(\tilde{X}_{s}\) is the initially generated hourly intensity, \(Z\) is the observed daily rainfall depth, and \(X_{s}\) is the adjusted hourly intensity. This rescaling forces the sum of the disaggregated rainfall to be consistent with the daily observations and therefore corrects the underestimation errors in the GSMaP rainfall. As a result, the daily and coarser-scale (e.g. bi-daily) extremes from the disaggregated rainfall are identical to the extremes from the observed data.
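A minimal sketch of the proportional adjusting procedure in Eq. (3) is given below for a single day. The function name and the sample values are hypothetical, and the handling of a simulated day with zero total is an assumption (the sketch simply rejects it).

```python
def proportional_adjust(simulated_hourly, observed_daily):
    """Eq. (3): rescale simulated hourly values so they sum to the observed daily depth."""
    simulated_total = sum(simulated_hourly)
    if simulated_total == 0:
        raise ValueError("simulated day is dry; cannot rescale to a wet observation")
    factor = observed_daily / simulated_total
    return [x * factor for x in simulated_hourly]


# Hypothetical simulated day: 8 mm in total, concentrated in four hours.
simulated = [0.0] * 20 + [1.0, 2.0, 4.0, 1.0]
adjusted = proportional_adjust(simulated, observed_daily=20.0)

print(adjusted[-4:])
print(sum(adjusted))
```

The temporal pattern within the day is preserved because every hour is multiplied by the same factor; only the magnitude changes.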

Step 4 Derive IDF curves

The annual maxima (AMAX) of the disaggregated hourly rainfall are then extracted and fed as the input to derive the IDF curves following the typical procedure described in the Introduction. The GEV distribution function is adopted in the study case based on standard practice (Public Utilities Board Singapore 2012); the Sherman equation is selected to regress the IDF curves, i.e.

$$i = \frac{a}{{\left( {d + b} \right)^{e} }}$$
(4)

where \(i\) is the rainfall intensity, \(d\) is the duration, and \(a\), \(b\), and \(e\) are the regression parameters determined by the least-squares method. Separate values of \(a\), \(b\), and \(e\) are regressed for each return period \(T\).
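One simple way to fit Eq. (4) is sketched below. It is not necessarily the least-squares variant used in this study: the sketch assumes a grid search over \(b\) combined with ordinary linear regression of \(\ln i\) on \(\ln(d + b)\), so the residuals are minimized in log space. The grid, the function name, and the sample intensities are hypothetical.

```python
import math


def fit_sherman(durations, intensities, b_grid=None):
    """Fit i = a / (d + b)**e by log-space least squares with a grid search over b."""
    if b_grid is None:
        b_grid = [0.5 * k for k in range(121)]  # 0 to 60 min in 0.5-min steps (assumed)
    y = [math.log(i) for i in intensities]
    n = len(y)
    mean_y = sum(y) / n
    best = None
    for b in b_grid:
        x = [math.log(d + b) for d in durations]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
        if best is None or sse < best[0]:
            # ln i = ln a - e * ln(d + b), so a = exp(intercept) and e = -slope
            best = (sse, math.exp(intercept), b, -slope)
    _, a, b, e = best
    return a, b, e


# Hypothetical intensities (mm/h) for one return period at seven durations (min).
durations_min = [10, 30, 60, 120, 360, 720, 1440]
intensities_mm_h = [190.0, 120.0, 80.0, 48.0, 19.0, 10.5, 5.8]

a, b, e = fit_sherman(durations_min, intensities_mm_h)
print(a, b, e)
```

Fitting in log space weights relative errors equally across durations; a fit in linear space would instead emphasize the high intensities at short durations.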

Study case

The proposed IDF deriving scheme is applied to the east of Singapore. Singapore is an island country located near the equator featuring a tropical rainforest climate. The area of Singapore is about 700 km², and the elevation ranges from 0 to 164 m above mean sea level (MSL) with relatively mild slopes. In Singapore, convective thunderstorms prevail and short-duration precipitation dominates; the design storms are therefore often derived with short durations (below 4 h; Public Utilities Board Singapore 2012), which makes the accuracy of IDF curves at sub-daily scales particularly important. Despite a wetter monsoon season from November to January, the annual rainfall extremes can occur in any month of the year. In order to capture all possible extremes, this study applies rainfall disaggregation for Singapore on the basis of all calendar months.

Figure 3 shows the location of Changi Met Station and the distribution of the GSMaP grid points. The Changi Met Station, located in the east of Singapore with a flat topography, uses a tipping bucket rain gauge. Hourly rainfall observations are available at the Changi Met Station for 50 years from 01 January 1966 to 31 December 2015. The GSMaP rainfall is extracted from the nearest grid point, i.e. 4 km from the Changi Met Station at (103.95° E, 1.35° N), as highlighted in Fig. 3.

Fig. 3

Location of Changi Met Station and GSMaP grid points

Table 1 lists the rainfall statistics from November to January, calculated from both GSMaP rainfall and rainfall observations at five timescales. Figure 4 presents the scatter plots of the tabulated rainfall statistics. The statistics of the GSMaP rainfall are highly correlated with the rainfall observation statistics, and the discrepancies between them are in general insignificant, with the exception of the variance, which GSMaP drastically underestimates. To account for the less representative variance, a smaller weight is applied to the variance term of the objective function in Eq. (1), i.e. \(w_{1} = w_{3} = w_{4} = 1\) whilst \(w_{2} = 0.1\). The values are determined by trial and error and were verified to have an insignificant influence on the disaggregation results. The general resemblance also confirms the applicability of estimating parameters for the BLRP model using the statistics calculated from GSMaP rainfall.

Table 1 Comparison of GSMaP and observation (Obs) rainfall statistics at Changi from November to January
Fig. 4

Scatter plots of GSMaP and Obs rainfall statistics

Table 2 summarizes the optimized parameters of the BLRP model. The parameters differ between months because of their distinct rainfall statistics. A total of 100 simulations are carried out with the same set of optimized parameters to account for and assess the uncertainties in the random process of the BLRP model. The boxplots of differences between the disaggregated and observed mean rainfall extremes are presented in Fig. 5. For all sub-daily durations, the median differences are slightly below zero, indicating that the disaggregated rainfall underestimates the extremes, whereas the height of the quantile boxes (75th percentile minus 25th percentile) increases from 5 to 15 mm/h as the duration decreases, implying that the uncertainty generally grows at shorter durations; for the daily and bi-daily durations, the disaggregated rainfall extremes are identical to the observed extremes.

Table 2 Summary of the optimized Bartlett-Lewis model parameters at Changi from November to January
Fig. 5

Differences between the disaggregated and observed mean rainfall extremes (centre mark: median; lower edge: 25th percentiles; upper edge: 75th percentiles)
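The ensemble summary shown in the boxplots can be sketched as below. The percentile convention (linear interpolation including both end points) is an assumption, as are the function name and the small five-member ensemble used for illustration; the study itself uses 100 simulations.

```python
from statistics import median, quantiles


def ensemble_summary(members):
    """Per-duration (25th percentile, median, 75th percentile) across ensemble members.

    `members` is a list of simulations, each a list with one value per duration.
    """
    summary = []
    for values in zip(*members):  # gather the same duration from every member
        q25, _, q75 = quantiles(values, n=4, method="inclusive")
        summary.append((q25, median(values), q75))
    return summary


# Hypothetical ensemble: five simulations, two durations (mm/h).
ensemble = [
    [61.0, 30.5],
    [58.2, 29.0],
    [66.4, 33.1],
    [55.9, 28.4],
    [63.3, 31.8],
]

for q25, q50, q75 in ensemble_summary(ensemble):
    print(q25, q50, q75)
```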

Figure 6 presents the IDF curves with 10-year, 20-year, 50-year, and 100-year return periods derived from different rainfall data. IDF A is derived from hourly rainfall observations, IDF B is derived from daily rainfall observations, whereas the dotted, dashed and solid lines in IDF C are derived, respectively, from the 25th percentile, 75th percentile and median rainfall extremes of the disaggregated hourly rainfall. IDF A, derived with the most thorough and accurate rainfall information, is used as the benchmark for comparison. Because the information on sub-daily rainfall extremes is missing, IDF B drastically underestimates the rainfall intensity, especially at short durations. As shown in Fig. 5, the medians of the disaggregated rainfall extremes are lower than the observed extreme values, so the IDF curves derived from the median extremes of the disaggregated hourly rainfall also underestimate the intensity. Nonetheless, compared with IDF B, the errors are corrected to a great extent in the median IDF C, which agrees much better with IDF A and has an overall improved accuracy. The improvement is even more prominent at shorter durations. Figure 6 also shows that the uncertainties in the derived IDF C are effectively reduced by taking the median extreme values of the ensemble of 100 BLRP simulations.

Fig. 6

IDF curves derived from different rainfall time series (A—from hourly observation, B—from daily observation, C—from hourly disaggregated rainfall)

Table 3 summarizes the root mean square error (RMSE) of IDF B and median IDF C, as assessed with IDF A, calculated by

$${\text{RMSE}} = \sqrt {\frac{1}{N}\mathop \sum \limits_{j = 1}^{N} \left( {i_{j} - i_{j}^{'} } \right)^{2} }$$
(5)

where \(N = 7\), \(j = 1,2, \ldots ,7\), \(i_{j}\) and \(i_{j}^{'}\) are, respectively, the rainfall intensities from IDF A and IDF B or median IDF C corresponding to the 7 durations, i.e. 10, 30, 60, 120, 360, 720 and 1440 min. As shown in Table 3, the RMSE is significantly reduced in the median IDF C, and the reduction rate increases as the return period increases from 10 years to 100 years with an average reduction of over 70%.
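For reference, Eq. (5) can be evaluated as in the sketch below. The intensity values are hypothetical and are not taken from Table 3; the function name is likewise illustrative.

```python
import math


def rmse(reference, candidate):
    """Eq. (5): root mean square error between two intensity series of equal length."""
    if len(reference) != len(candidate):
        raise ValueError("series must cover the same durations")
    squared = [(r - c) ** 2 for r, c in zip(reference, candidate)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical intensities (mm/h) at 10, 30, 60, 120, 360, 720 and 1440 min.
idf_a = [190.0, 120.0, 80.0, 48.0, 19.0, 10.5, 5.8]
idf_b = [90.0, 62.0, 45.0, 30.0, 14.0, 9.0, 5.8]
idf_c = [170.0, 110.0, 74.0, 45.0, 18.0, 10.2, 5.8]

rmse_b = rmse(idf_a, idf_b)
rmse_c = rmse(idf_a, idf_c)
print(rmse_b, rmse_c, 100.0 * (rmse_b - rmse_c) / rmse_b)
```

The last printed value is the percentage reduction in RMSE of the second candidate relative to the first.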

Table 3 RMSE of IDF B and median IDF C as assessed with IDF A

Conclusions

This paper proposes a novel scheme for deriving rainfall IDF curves, which combines in situ rainfall observations and remote sensing data (GSMaP) through stochastic downscaling and successfully utilizes the downscaled, or disaggregated, sub-daily rainfall to derive the IDF curves for Singapore. The statistics calculated from GSMaP data are utilized to optimize the parameters of the Bartlett-Lewis rectangular pulses (BLRP) model. The optimized BLRP model is then applied to disaggregate the daily rainfall observations into hourly time series. The annual maxima (AMAX) of the disaggregated hourly rainfall, which preserves both the hourly and daily statistical characteristics, are then extracted and used to derive the IDF curves.

The proportional adjusting procedure in the BLRP model regulates the disaggregated hourly rainfall to preserve the daily statistics of the daily observations. The daily and coarser-scale (e.g. bi-daily) extremes from the disaggregated rainfall are identical to the extremes from the observed data. This rescaling process in BLRP overcomes the tendency of GSMaP to underestimate the overall magnitude of rainfall. Combining the sub-daily rainfall extremes from the disaggregated rainfall with the daily and coarser-scale (e.g. bi-daily) rainfall extremes from the observations is shown to result in a more reliable IDF derivation, especially for the sub-daily durations.

The seven parameters of the BLRP model are simultaneously optimized using the genetic algorithm (GA). An ensemble of 100 BLRP simulations is conducted to account for the uncertainties in the stochastic disaggregation process. The overall accuracy of the IDF curves is significantly improved after considering the sub-daily extremes retrieved from the disaggregated hourly rainfall. The improvement is even more prominent at short durations, which are of greater importance for regions where convective thunderstorms prevail and short-duration precipitation dominates. On average, the RMSE is reduced by over 70% for Singapore.

This study suggests a promising solution to the scarcity of high-temporal-resolution data, which often limits the accuracy of IDF curves and hence the reliability of engineering design for shorter durations. However, caution should be taken because the study case is located in a relatively flat area (east of Singapore) where the impact of topography on spatial variability is assumed to be negligible. Furthermore, the IDF deriving scheme is applied and validated using rainfall data from one station only. Considering more stations and further validating the method under a variety of climatic features are of interest for future study.