Estimation of short-duration rainfall intensity from daily rainfall values in Klang Valley, Malaysia

Data on intensity–duration–frequency or design rainfall are among the most important information required for various hydrological and water resources studies. However, such crucial data are often unavailable in various parts of the world because there are not enough rain gauging stations. Determining design rainfall from raw data is tedious, and it is occasionally impossible where short-duration rainfall data are scarce or absent. Manual rain gauges generally outnumber automatic gauges, which makes it difficult to obtain adequate short-duration rainfall data, although such data are very important for urban hydrology. No graphical or mathematical relation could be found in the literature for quick estimation of short-duration design rainfall from the daily rainfall data recorded by manual stations. Annual maximum rainfall data from 143 rain gauging stations located in the Klang Valley in Malaysia were used in this study. Statistical analyses and logarithmic graph fitting techniques were used to develop an excellent correlation between short-duration rainfall and daily rainfall values for 96 automatic and 47 manual stations. The rainfall data were analyzed to obtain design rainfall of various durations and return periods. The 15-, 30- and 45-min rainfall, which are the most common rainfall durations in the study area, were observed to be 32.4%, 47.1% and 57.4% of the daily rainfall amount, respectively. The rainfall during 1-, 2- and 3-h storm events contributes 64.9%, 76.5% and 80.9% of the daily rainfall, respectively. Such relations can be used for quick estimation of short-duration rainfall, saving time, money and other resources.


Introduction
Rainfall data are very important for climate studies, water resources evaluation, drainage design (Desa and Rakhecha 2004; Wang 1987), environmental studies and many other purposes. As such, most countries try to maintain as many rainfall and other climate monitoring stations as possible. Continuous collection and archiving of climatic data, including rainfall, are of enormous value for studying the effects of climate change on the environment and society. It is reported that rainfall of the last century is the highest over the past 8500 years (Hong et al. 2014). In a warming environment, an increasing trend of extreme rainfall events is seen, in terms of either rainfall quantity or event frequency (Goswami et al. 2006; Allan and Soden 2008). It is also evident that the seasonality of rainfall is changing in some parts of the tropics (Kumar 2013). The extreme rainfall events in Peninsular Malaysia are reported to show an increasing trend between 1975 and 2010 (Syafrina et al. 2015). As such, analysis of the available rainfall data has become crucial for good planning of water resources development (Okonkwo and Mbajiorgu 2010). Quantification of rainfall is needed for proper planning and design of various water resource projects (El-Sayed 2011). In particular, estimating short-duration rainfall intensities is a paramount need in many water-related projects such as urban drainage design (Haddad et al. 2011) and flash flood mitigation in hilly areas. Over-design or under-design of hydraulic structures such as farm dams or culverts may waste considerable national resources or may compromise the safety of the structures (Reich 1961, 1963) and of people living downstream of them. According to Hamzah (2005), using at-site data to estimate design rainfall is the most common and reliable practice.
However, he recommended regional estimation (Hosking and Wallis 2005) to calculate design rainfall, which he described as a more precise method of estimation. Desa and Rakhecha (2004) analyzed historical rainfall data from 13 rain gauges in Selangor, Malaysia, and described the characteristics of short-duration extreme rainfall in the study area. The reported results can be helpful for designing urban drainage systems in Selangor. In this study, an attempt has been made to estimate short-duration rainfall intensity by correlating it with the daily rainfall data of the Klang Valley. The specific objective of this paper was to develop equations and envelope curves to estimate short-duration (< 24 h; e.g., 15, 30, 45 min, 1, 2, 3, 4, 5, 6, 9, 12 and 18 h) rainfall intensity by using rainfall data of various storm durations. It was anticipated that the availability of such design charts and equations would enable quick estimation of any short-duration rainfall for planning and design purposes.

Literature review
Rashid et al. (2012) developed short-duration rainfall intensity-duration-frequency empirical equations for Sylhet City Corporation (SCC), Bangladesh. To estimate the short-duration rainfall intensity (SDRI) from daily rainfall data, they used an empirical reduction formula given by the Indian Meteorological Department (IMD), which aims at estimating rainfall intensity of any return period with the least amount of effort. The equation is as follows:

$$P_t = P_{24}\left(\frac{t}{24}\right)^{1/3} \qquad (1)$$

where $P_t$ is the required rainfall depth in mm at t-h duration, $P_{24}$ is the daily rainfall in mm and $t$ is the duration of rainfall, in h, for which the rainfall depth is required. Using the same equation, Chowdhury et al. (2007) developed the short-duration intensity-duration-frequency (IDF) curve for SCC with return periods of 2, 5, 10, 50 and 100 years. Logah et al. (2013) also used the same equation to estimate short-duration rainfall when developing short-duration IDF curves for Accra, the capital of Ghana. Nguyen et al. (1998) proposed a method for estimating the distribution of short-duration (e.g., 1 h) extreme rainfalls at sites where data for the time interval of interest do not exist, but rainfall data for longer durations (e.g., 1 day) are available (partially gaged sites). The method was based on the scale invariance theory, and the methodology was applied to extreme rainfall data from a network of 14 recording rain gauges in Quebec (Canada). They showed that the rainfall estimates obtained by this method are comparable with those based on available at-site data. Garcia-Bartual and Schneider (2001) analyzed 408 rainfall events in Alicante (Spain) for the period of 1925-1992 and fitted nine frequently used empirical functions to estimate maximum expected short-duration rainfall intensities from extreme convective storms. They assessed the reliability of the functions and concluded that three-parameter extreme value distribution functions (such as the generalized extreme value, GEV) provide satisfactory results. The generalized least squares regression (GLSR) model to estimate design rainfall for short durations was presented by Haddad et al. (2011) and applied to Australian data. They concluded that the equations based on GLSR gave satisfactory outcomes for both 6-min and 1-h durations and that the model assumptions were also met.
Therefore, they recommended the GLSR-based model, which can give more consistent estimates of short-duration rainfall intensity. Yu et al. (2004) developed regional IDF formulas for non-recording sites based on the scaling theory by analyzing data from forty-six recording rain gauges over northern Taiwan. Their proposed formulae resulted in reasonable verifications and simulations. Smithers and Schulze (2001) used a regional approach based on L-moments to estimate short-duration (≤ 24 h) design storms in South Africa. The method aimed at eliminating data limitation problems and improving reliability. They performed regionalization by using geographical and hydrometeorological features of the sites. The paper provided ways to estimate short-duration design storms at locations in South Africa where no rain gauge is installed. They showed that for a 24-h storm, the mean of the daily annual maximum series (AMS) can be estimated fairly precisely as a function of mean annual precipitation (MAP).
The magnitude of overall rainfall as well as its annual variability and seasonal distribution has been severely altered by climate change (Min et al. 2011; Easterling et al. 2000; Zeng et al. 1999). Cheng and AghaKouchak (2014) showed that, for a changing climate, the conventional IDF curves are not sufficient for infrastructure design. Their calculation demonstrated that the stationary assumption behind IDF curves can underestimate extreme rainfall events by as much as 60%. They used Bayesian inference to estimate non-stationary IDF curves. Hamada et al. (2015) analyzed 11 years of space-borne rainfall data and concluded that there is little relationship between extreme convective events and extreme rainfall events.
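As a hedged illustration of the IMD reduction formula in Eq. 1, the short Python sketch below computes the t-h rainfall depth from a daily total. The function name and the 100 mm daily value are assumptions made for this example and do not come from the cited studies.

```python
def imd_short_duration_depth(p24_mm: float, t_hours: float) -> float:
    """Rainfall depth (mm) at duration t_hours from Eq. 1: P_t = P_24 * (t/24)^(1/3)."""
    if not 0 < t_hours <= 24:
        raise ValueError("t_hours must lie in (0, 24]")
    return p24_mm * (t_hours / 24.0) ** (1.0 / 3.0)


daily_mm = 100.0  # hypothetical daily rainfall, not a measured value
for t in (0.25, 1.0, 3.0, 24.0):
    print(f"{t:5.2f} h -> {imd_short_duration_depth(daily_mm, t):6.1f} mm")
```

Because the exponent is fixed at 1/3, the formula returns the full daily depth at t = 24 h and half of it at t = 3 h, whatever the location.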

Study area
The study area is about 8396 km² and is located along the west coast of Peninsular Malaysia. The valley is strategically important, as the country's main economic, business and commercial hubs are located within the Klang Valley. It lies within latitude of about 2°40′49″ to 3°50′28″ N and longitude of 100°43′32″ to 102°09′31″ E. The east side of the study area is hilly (elevation varying from an average value of about 80 m to as high as 600 m MSL). The average elevation of the main valley area varies within 20-35 m MSL, with patches of hilly areas within the valley itself. The terrain eventually slopes down to the sea coast on the west side, where the Straits of Malacca are located.
Similar to the other parts of the west coast of the Peninsula, the climate has more or less uniform temperature throughout the year (24-27 °C at night and 33-38 °C during the day), high humidity and heavy rainfall with two major monsoon seasons (southwest and northeast monsoons). The climate between the two monsoons is characterized by distinct transitional seasons that last about 3 months each. Although the east coastal areas of the Peninsula receive high rainfall during the months of November-January, most parts of the Klang Valley receive less rain during this northeast monsoon due to the blocking effect of the central mountain range running from north to south of the valley. The average annual rainfall of the project area may vary from 1800 to 2400 mm, which will be verified upon collection of the rainfall data. This area is also covered by a satisfactory rain gauging network, which is important for such a study.

Data availability
According to the Department of Irrigation and Drainage (DID) database in 2015, there are 236 rainfall stations in the State of Selangor and 54 rainfall stations within the Federal Territory of Kuala Lumpur. However, 132 and 8 stations were selected from the two territories, respectively, as only these stations passed the quality check criteria (Einfalt et al. 2008; Mudenda 2007; NOAA 2017). Three stations from the Malaysian Meteorological Department (MMD) were also included in this study, which made the total number of stations with acceptable rainfall data 143. There were 96 automatic and 47 manual stations used in this study. The data period varied depending on the installation date of the rain gauge and on the quality of usable data for acceptable analysis. The oldest data were from 1927, while a few stations started operation in 2004. For all rainfall stations, the latest data were collected up to the year 2013, when the study was conducted. The final numbers of usable years of data were determined after checking the quality of the annual maximum rainfall data.

Data screening
The annual maximum rainfall data of 15-, 30-, 45-min, 1-, 2-, 3-, 4-, 5-, 6-, 9-, 12- and 18-h durations were retrieved for the automatic stations. For the manual stations, the DID's data bank provided the annual maximum series (AMS) rainfall of 24-h duration (daily rainfall). Smithers (1993) showed that, for AMS analysis with sufficient data, removing inconsistent data makes no significant difference to the results compared with retaining them. However, to achieve greater accuracy and reliability of the results, the data and their collection procedures were screened for quality control using the standard protocols and methods used by Einfalt et al. (2008), Mudenda (2007) and NOAA (2017).
Examples include identification of nearly impossible values, detection of gaps in the data, sensitivity of measurement variability, and checks of internal and spatial consistency. How the data are collected, screened and archived by the relevant authorities in Malaysia was also given due consideration. The data were screened through scatter plots, and an outlier analysis (data above the 95th percentile or below the 5th percentile) was conducted to identify and discard doubtful data, which amounted to a negligible share, about 0.018% of the total data. The historical rainfall data of the Department of Irrigation and Drainage Malaysia (DID) and MMD are stored in a database managed by the TIDEDA program developed by the National Institute of Water and Atmospheric Research (NIWA 2017), New Zealand. The number of usable years of quality data for the various rainfall durations ranged between 9 and 86.
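The percentile-based screening described above can be sketched as follows. This is a minimal illustration, assuming that values outside the 5th to 95th percentile band are only flagged as candidates for review rather than discarded automatically; the function name and the sample series are hypothetical and are not the DID or MMD records.

```python
import statistics


def flag_percentile_outliers(values_mm, lower_pct=5, upper_pct=95):
    """Split values into (kept, flagged) using percentile bounds.

    Values below the lower percentile or above the upper percentile are
    flagged as doubtful; everything else is kept.
    """
    # quantiles(..., n=100) returns the 1st to 99th percentile cut points.
    cuts = statistics.quantiles(values_mm, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    kept = [v for v in values_mm if low <= v <= high]
    flagged = [v for v in values_mm if v < low or v > high]
    return kept, flagged


# Hypothetical annual maximum daily rainfall series (mm), for illustration only.
ams_mm = [77.0, 85.2, 90.1, 94.6, 98.3, 101.5, 104.9, 110.2, 118.7, 156.4]
kept, flagged = flag_percentile_outliers(ams_mm)
print("flagged for review:", flagged)
```

A purely percentile-based rule always flags the tails of a series, so in practice the flagged values would still need to be checked against scatter plots and neighboring stations before any are removed.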

Statistical analyses
Quartile analysis was conducted to prepare the box plots, which describe the variation and distribution of data within the time period considered for the analysis. Each plot shows the maximum, the minimum and the three quartiles (25th, 50th and 75th percentiles). The mean and the median each have their own advantages and disadvantages. The median was preferred over the mean, as it is not influenced by exceptional AMS values.
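The quartile analysis behind a box plot can be illustrated with the sketch below, which also shows why the median is less sensitive than the mean to an exceptional value. The function name and the five-value series are assumptions for this example only.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) as used for a box plot."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


# Hypothetical annual maximum series (mm) with one exceptional value.
ams_mm = [80.0, 90.0, 100.0, 110.0, 300.0]
print(five_number_summary(ams_mm))
print("mean:", statistics.mean(ams_mm), "median:", statistics.median(ams_mm))
```

In this made-up series the single 300 mm value pulls the mean up to 136 mm, while the median stays at 100 mm.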

Regionalization
A regional approach to rainfall frequency analysis attempts to supplement the limited information available from the relatively short periods of record with regional information from surrounding stations (Smithers and Schulze 2000). In nearly all practical situations, a regional method proves to be more efficient than an at-site analysis (Potter 1987). This opinion is also backed by Lettenmaier (1985), who urged that "regionalisation is the most appropriate method of improving flood quantile estimation," and by Hosking and Wallis (2005), who, after a review of the recent literature, advocated the use of regional frequency analysis based on the belief that a "well conducted regional frequency analysis will yield quantile estimates accurate enough to be useful in many realistic applications." Rainfall data of various durations were plotted on the map to study the spatial distribution of the mean annual maximum rainfall values and their ratios to the daily rainfall values. Data of the selected stations were plotted (using Grapher software) to obtain isohyetal maps and to identify similarity of rainfall patterns within the study area.

Short-duration rainfall design charts
The relation of short-duration rainfall to the daily rainfall value was determined and plotted to develop a regression model. Depending on the curvature, and in order to get reliable estimates, two equations were proposed: one for rainfall durations up to 2 h and the other for durations of more than 2 h but less than 24 h (1 day). The two equations may give slightly different rainfall estimates at the transition duration, which is 2 h in this study. As such, rainfall of any duration close to 2 h should be calculated using both equations, and the higher value should be accepted as the design rainfall value. Short-duration rainfall data of the nearby automatic station were considered for testing the output design graphs.
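A logarithmic regression of this kind can be sketched as below. The example fits y = a ln(t) + b by ordinary least squares to the ratios of short-duration to daily rainfall quoted in the abstract for durations up to 2 h. It is only an illustration of the fitting technique: the function name is an assumption, t is assumed to be in hours, and the fit is not claimed to reproduce the authors' envelope coefficients.

```python
import math

# Ratios of short-duration to daily rainfall quoted in the abstract (t <= 2 h).
durations_h = [0.25, 0.5, 0.75, 1.0, 2.0]
ratios = [0.324, 0.471, 0.574, 0.649, 0.765]


def fit_log_curve(t_hours, y):
    """Least-squares fit of y = a * ln(t) + b; returns (a, b)."""
    x = [math.log(t) for t in t_hours]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    a = sxy / sxx
    return a, y_mean - a * x_mean


a, b = fit_log_curve(durations_h, ratios)
print(f"ratio = {a:.4f} ln(t) + {b:.4f}")
```

The same function could be applied to the durations above 2 h to obtain the second branch, and both branches evaluated near 2 h so that the higher estimate is retained.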

Evaluation of available data
A sufficient number of stations is available in the Klang Valley, and they are satisfactorily distributed within the study area. However, the scatter plots of the rainfall data indicated discrepancies among a few stations. This could be due to faulty gauging stations or improper processing of raw data. In particular, data from the stations installed after the year 2000 could not be used due to both exceptionally high and exceptionally low values, which can be attributed to incomplete calibration of the instruments at the time the data were collected. Nevertheless, as stated in the previous section, the outliers were identified and eliminated from the raw data to reduce the noise in the statistical analysis (Cunnane 1989).

Statistical results
A statistical summary of rainfall data from the selected automatic stations in the Klang Valley is given in Table 1. The highest daily maximum rainfall value recorded was 156.4 mm and the lowest was 77.0 mm, so the highest value was about twice the lowest.
Box plots are a very efficient way to show the statistical distribution of data, indicating the maximum, 75th percentile, median, 25th percentile and minimum values, from the top of the graph, respectively. A small box with a narrow gap between high and low values indicates good consistency and uniformity among the data. The box plot of the mean annual maximum values for the various durations of all rainfall stations in the Klang Valley is shown in Fig. 1.
The overall patterns of the percentile values look similar. However, the lower rainfall magnitudes (closer to the 25th percentile) are not as variable as the higher rainfall values. Also, the shorter durations show a more uniform pattern with a smaller inter-quartile range than the longer durations (which have a larger range between the 75th and 25th percentile values). For storm durations less than 2 h, the 25th percentile rainfall amounts were closer to the median than the 75th percentile values were. However, for storm durations longer than 4 h, the median rainfall value tends to shift toward the 75th percentile value, except for the daily rainfall, for which the median again moves closer to the 25th percentile value (Fig. 1). Interestingly, there is not much variability in the median rainfall for 4-9-h durations. This could be due to the fact that most short-duration heavy rainfall results from a single convective storm event, whereas long-duration rainfall might result from multiple thunderstorms or several bursts of rainfall during the monsoon seasons.

Regional rainfall relation
The mean annual maximum rainfall of various durations was plotted using Grapher to study the relation with respect to storm duration. However, no distinct pattern that could lead to useful regionalization could be identified from the spatial data. As such, envelope curves of the short-duration rainfall coefficients (SDRC, the ratio of short-duration rainfall to the daily rainfall value) are proposed for the whole Klang Valley, instead of dividing the area into several irregular regions. Envelope curves of the best fit relationships for the upper envelope (90th percentile), median envelope (50th percentile) and lower envelope (10th percentile) were developed for the ratio of short-duration rainfall to daily (24 h) rainfall values for the Klang Valley and the areas within Selangor and the Federal Territories (Kuala Lumpur and Putrajaya). Figure 2 shows that the mean annual maximum daily rainfall amount was the highest in the city centers of the Federal Territory of Kuala Lumpur and Kuala Kubu Baru areas (about 120 mm). For the areas neighboring these city centers and towns, the mean annual maximum daily rainfall was about 110 mm. The areas along the middle reaches of the west coast (Fig. 2) experienced mean annual maximum daily rainfall of about 100 mm. Ratios of the mean annual maximum short-duration rainfall values to the mean of the annual maximum daily rainfall of each station were calculated (Table 2) to formulate the information (Table 3, Eq. 2-7 and Fig. 5) required to calculate the short-duration design rainfall value of any desired duration less than 24 h. Figure 3 shows that the difference between the median and maximum SDRC values was high for storm durations up to 2 h and smaller for storm durations longer than 2 h. The opposite was true for the difference between the median and lowest SDRC values for the study area. An isohyetal map of the hourly rainfall is shown in Fig. 4, which presents the spatial distribution of the ratios of 1-h rainfall to the daily (24 h) rainfall value.
No visually distinct pattern in the spatial distribution of the rainfall ratios could be observed for the Klang Valley. As such, instead of breaking the area into several regions, three envelope curves are proposed to estimate the short-duration rainfall values when the historical daily rainfall values are available in the Klang Valley.
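The three envelope values for a single duration can be sketched as percentiles of the station ratios, as below. This is a minimal illustration under stated assumptions: the function name is hypothetical, the station values are made up, and the percentile interpolation rule is a choice of this example rather than something specified in the study.

```python
import statistics


def envelope_ratios(short_mm, daily_mm):
    """Return (lower, median, upper) = (10th, 50th, 90th) percentile ratios."""
    ratios = [s / d for s, d in zip(short_mm, daily_mm)]
    # quantiles(..., n=10) returns the 10th to 90th percentile cut points.
    deciles = statistics.quantiles(ratios, n=10, method="inclusive")
    return deciles[0], deciles[4], deciles[8]


# Hypothetical mean annual maximum 1-h and daily rainfall (mm) at 11 stations.
one_hour_mm = [55, 58, 60, 62, 65, 66, 68, 70, 72, 75, 80]
daily_mm = [100] * 11
lower, median, upper = envelope_ratios(one_hour_mm, daily_mm)
print(f"lower={lower:.2f} median={median:.2f} upper={upper:.2f}")
```

Repeating the calculation for each duration gives three series of ratios to which envelope curves can then be fitted.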

Short-duration rainfall design charts
Design charts for short-duration rainfall coefficients (SDRC), which are basically ratio to the mean annual maximum daily rainfall values, for the Klang Valley are developed as shown in Fig. 5. The SDRC values are also given in Table 3, which can be multiplied with the mean annual maximum daily rainfall values of any rainfall station to estimate the mean annual maximum short-duration rainfall values. Table 3 or the design charts and equations in Fig. 5 can be used to estimate the short-duration rainfall of any duration less than 24 h, by using the following equations:

Short-duration rainfall estimation equations
$$\mathrm{SDRC}_{U1} = 0.2355\,\ln(t) + 0.7165, \qquad t < 2\ \mathrm{h} \qquad (2)$$

Two sets of equations are given for each envelope curve. One equation (with subscript 1) is for storm durations less than 2 h and the other (with subscript 2) is for durations equal to or more than 2 h. Now, the mean annual maximum short-duration rainfall (SDR) can be estimated by the following equation:

$$\mathrm{SDR} = \mathrm{SDRC} \times \mathrm{MAMDR} \qquad (8)$$

where MAMDR is the mean annual maximum daily rainfall (mm). Although the short-duration rainfall estimation equations and charts (presented in this report) were developed based on the MAMDR records, the procedure can be used for "quick estimation" of any short-duration rainfall. For instance, if the "daily rainfall" of 5-year return period (or any other return period) is known for any location, this procedure can be
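The estimation procedure can be sketched in a few lines of Python. The example below uses only the upper-envelope equation for durations under 2 h (Eq. 2), since that is the only coefficient pair reproduced here, and multiplies the coefficient by a daily rainfall value. The function names are assumptions for illustration, t is assumed to be in hours, "Ln" is taken as the natural logarithm, and the 120 mm input is the approximate MAMDR quoted for the Kuala Lumpur city center.

```python
import math


def sdrc_upper_short(t_hours: float) -> float:
    """Upper-envelope coefficient from Eq. 2, valid for durations under 2 h."""
    if not 0 < t_hours < 2:
        raise ValueError("Eq. 2 applies only for 0 < t < 2 h")
    return 0.2355 * math.log(t_hours) + 0.7165


def short_duration_rainfall(sdrc: float, daily_mm: float) -> float:
    """Short-duration rainfall (mm) as coefficient times daily rainfall."""
    return sdrc * daily_mm


mamdr_mm = 120.0  # approximate MAMDR for the Kuala Lumpur city center
for t in (0.25, 0.5, 1.0):
    sdr = short_duration_rainfall(sdrc_upper_short(t), mamdr_mm)
    print(f"{t:4.2f} h: SDRC = {sdrc_upper_short(t):.3f}, SDR = {sdr:.1f} mm")
```

At t = 1 h the logarithmic term vanishes, so the upper-envelope coefficient equals the intercept, 0.7165.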

Conclusions
Short-duration rainfall is critical for small catchments and urban drainage systems. There is always a shortage of short-duration rainfall data, as recording such data requires automatic rain gauges. In contrast, daily rainfall values are generally available due to the use of cheap manual instruments. The authors could not find any reliable method to estimate short-duration rainfall based on daily rainfall values. As such, long-term data of the rainfall stations in the Klang Valley area were used to develop a statistically strong (R² > 0.90) relation between short-duration rainfall and daily rainfall data. A distinctive trend was observed, which might be used to estimate short-duration rainfall based on the daily rainfall data of the manual rain gauging stations. Use of such relations may help the relevant authorities reduce the need to install more rain gauge stations in the country, including the expensive automatic data logging rainfall measuring instruments.