Assessment of the performance of satellite rainfall products over Makkah watershed using a physically based hydrologic model

The Makkah region is one of the most flash flood-prone areas of Saudi Arabia because of its terrain and because synoptic-scale weather systems intensify through interaction with the local topography, producing intense, short-lived convective rainfall events, although such conditions are infrequent. Most of these events last for less than two hours. This study assesses the performance of five satellite precipitation products over a 1725 km² sparsely gauged, arid basin. A fully distributed, physically based hydrologic model was forced by the five satellite precipitation products, and the evaluation included the hydrographs and runoff maps predicted by the model. Moreover, the propagation of the satellite rainfall errors into runoff predictions was quantified. Large variations and significant biases were found in the satellite precipitation estimates compared to the available ground rainfall measurements. The Early IMERG product showed the best agreement with the reported total rainfall accumulations, followed by Late IMERG, while the other products significantly underestimated precipitation accumulations. Comparison with estimated runoff peaks showed that the Early IMERG product has the lowest errors in runoff peaks. Therefore, the hydrographs produced by the Early IMERG product were used as a reference to quantify the propagation of satellite precipitation errors into runoff predictions over the Makkah watershed. The results clearly indicated that both systematic and random rainfall errors were significantly amplified in runoff predictions.


Introduction
The temporal and spatial variability associated with extreme hourly or daily rainfall events and its interactions with the spatial distribution of the watershed physiography and drainage network determine the watershed response. These complexities make it very difficult to adequately simulate and predict flooding details, especially in urban areas (Morin et al. 2006; Borga et al. 2007; Norbiato et al. 2007; Meierdiercks et al. 2010; Mejía and Moglen 2010; Wright et al. 2012, 2013; Nixon et al. 2013; Yang et al. 2013). The use of high-resolution precipitation products and physically based, fully distributed hydrological models representing the complexities of the watershed's spatial heterogeneity is the most viable approach currently available to address this challenge. Precipitation is the key input in rainfall-runoff models, and the simulated hydrologic processes are directly impacted by the spatial rainfall distribution (Dai 2006; Schuurmans et al. 2007; New et al. 2001; Chintalapudi et al. 2012). Before the late 1990s, precipitation was primarily measured with rain gauges and weather radar networks (Krajewski and Smith 2002). Unlike ground-based gauge networks, weather radars can provide near-real-time information at fine spatial and temporal resolutions over a continuous region. Many studies validated radar measurements and demonstrated their advantages over gauge observations (e.g., Habib et al. 2009; Wang et al. 2008). Nevertheless, the significant gaps in radar coverage over complex terrain, caused by the sparse global distribution of radar networks and by beam blockage, limit the widespread use of radar precipitation technology (Maddox et al. 2002).
The recent availability of satellite precipitation products at a global scale with increasing spatiotemporal resolutions supports the potential of their use as input to flood simulation and forecasting models, especially in ungauged watersheds (Sharif et al. 2017). However, several validation studies from across the globe demonstrated the uncertainty of satellite precipitation estimates, especially for very light precipitation and for extreme events of high magnitude (Furl et al. 2018). For instance, Nikolopoulos et al. (2010) showed that satellites consistently underestimated the mean areal precipitation of several high-magnitude events in Italy. Mei et al. (2014) demonstrated that satellite rainfall products were no more biased for short-duration events than for long-lived frontal events but showed a wider range of errors for short-duration events. Moreover, satellite products showed high inconsistency across different climatic conditions (Thiemig et al. 2012) and across different types of terrain (Hirpa et al. 2010). These results highlight the need for more assessment and analysis of satellite rainfall products.
The satellite-based precipitation products such as Global Precipitation Measurement (GPM) (Kubota et al. 2007), CPC MORPHing technique (CMORPH) (Joyce et al. 2004), the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) (Sorooshian et al. 2000) and the Tropical Rainfall Measuring Mission (TRMM) (Huffman et al. 2007) products are the most commonly used.
Several studies demonstrated that the difference between the satellite precipitation products' performance depends on the region's hydro-climatic characteristics (Barrett 1993; Yilmaz et al. 2005). Liu et al. (2015) evaluated satellite precipitation products over a large watershed in China. They reported that TRMM 3B42 performed well on annual and monthly scales and CMORPH at the daily scale, while PERSIANN had an inferior performance at all time scales. However, Tan et al. (2015) reported that TRMM outperformed CMORPH and PERSIANN among six satellite precipitation products assessed over Malaysia. In another study, the CMORPH product was found to be less accurate than other satellite precipitation products over Indonesia (Vernimmen et al. 2011). Moreover, the earlier version of CMORPH had an insignificant correlation with rain gauges over the Urmia Basin in Iran and the United Arab Emirates (UAE) (Ghajarnia et al. 2015; Wehbe et al. 2017). Other studies found CMORPH and PERSIANN products to be spatially inconsistent, while TRMM and GPM were reported to be more accurate and relatively consistent but generally underestimated the heavy storm events and overestimated the average rainfall events (Omranian and Sharif 2018; Mantas et al. 2015; Thiemig et al. 2012).
The resolution of satellite precipitation products has increased since the launch of the GPM in 2014. The Integrated Multi-SatellitE Retrievals for the GPM (IMERG) precipitation products are available at 0.1° × 0.1° spatial resolution every 30 min (Kidd et al. 2020). Krishna et al. (2017) concluded that IMERG products outperformed the TRMM-3B42 product at different time scales over the Indian subcontinent. Verma and Ghosh (2018) found that the IMERG Late and Final products showed better agreement with field data than the IMERG Early product over the Gangotri glacier in India. Omranian and Sharif (2018) reported a good performance by the three IMERG products over the Lower Colorado River Basin of Texas and concluded that the products can be used in flood forecasting and water resources management. Wu et al. (2019) assessed the IMERG and TRMM-3B42V7 products over the Yangtze River Basin, China, and found that the IMERG products were more skillful than the TRMM-3B42V7 product in detecting light precipitation. Yuan et al. (2019) found that the performance of the IMERG Final product over a poorly gauged watershed in Myanmar was better than that of the TRMM-era 3B42V7 product at sub-daily scales. However, they suggested that the IMERG products' accuracy needs to be improved before the products are used in flood control and disaster mitigation in ungauged basins. Yang et al. (2020) found that the TRMM 3B42 and IMERG Final products had adequate performance in estimating monthly rainfall, while daily rainfall was better estimated by the IMERG products than by the TRMM products over the Shuaishui River Basin, China. However, they suggested that further improvements were needed before hourly rainfall estimates could be used in real-world applications.
The higher resolution of the recent satellite-based precipitation products is much needed in hydrologic applications. For example, fully distributed hydrologic models can employ the high spatiotemporal resolution rainfall products for simulations over ungauged and sparsely gauged basins, especially in developing countries (Meskele and Moradkhani 2009). Chintalapudi et al. (2012) simulated flood events over the Guadalupe watershed in the USA using three types of rainfall products. They concluded that some satellite products provided runoff predictions comparable to those estimated using calibrated radar rainfall measurements. Thom et al. (2017) simulated runoff over the Srepok River watershed in Vietnam using satellite precipitation products and four gridded rain gauges as input to a hydrologic model. They concluded that both TRMM estimates and the rain gauge observations can be used in water resources management applications and for driving hydrological models in data-scarce areas. Li et al. (2017) examined the hydrological utility and uncertainty of the IMERG products relative to gauge and gauge-corrected radar products over the Ganjiang River basin, China. They suggested that satellite products need significant improvement before they can be trusted in hydrologic applications. Tan et al. (2018) evaluated the three IMERG products over Malaysia's Kelantan River Basin and concluded that the three IMERG products had great potential for hydrometeorological applications. Zhang et al. (2019) concluded that the IMERG products performed better than the TRMM products in hydrologic applications when both were compared with simulations driven by rain gauge observations. They also concluded that the IMERG and TRMM precipitation estimates were adequate as input to a conceptual hydrological model for humid basin simulation in China. Jiang et al. (2010) found that the performance of CMORPH in simulating runoff over the Laohahe River basin was better than that of TRMM, while TRMM-3B42 was better than PERSIANN in runoff simulation over the Luanhe River basin, China (Ren et al. 2018). Zeweldi and Gebremichael (2011) found that the performance of CMORPH rainfall products was equivalent to that of rain gauges when used to develop a hydrologic model of a small watershed located in northern Mississippi, USA.
Several hydrometeorological studies have been conducted in the Arabian Peninsula to investigate the quality and hydrologic worth of satellite rainfall products. Almazroui et al. (2012) evaluated TRMM rainfall data from 1998 to 2009 over Saudi Arabia. Although the assessment of the TRMM products showed variations in their accuracy, they were recommended for use in the country to compensate for the lack of rain gauges. Tekeli (2017) reported encouraging results in detecting an extreme rainfall event in Jeddah, Saudi Arabia, using TRMM 3B42RT. Mahmoud et al. (2018) validated the three IMERG products using ground rainfall observations at daily and event time scales over Saudi Arabia. They recommended the IMERG Final product to complement or replace ground precipitation observations for poorly gauged and ungauged regions. A similar conclusion was reached in a study conducted in the United Arab Emirates (Mahmoud et al. 2019), which recommended using the IMERG near-real-time product in early flood warning systems. Alsumaiti et al. (2020) assessed the three IMERG products and CMORPH for the period 2010-2018 over the United Arab Emirates. They reported that the two products can improve the temporal resolution and fill the spatial gaps in rainfall observations. In this study, the three IMERG products, PERSIANN, and TRMM precipitation products were assessed over a rapidly urbanized arid watershed in Saudi Arabia. All rainfall products were used as input to the Gridded Surface Subsurface Hydrologic Analysis (GSSHA) model to simulate hydrographs and runoff, which were then evaluated by comparison with the observed discharge.

Study area
Located between latitudes 18° 15′ 00″ and 23° 50′ 00″ N and longitudes 38° 42′ 00″ and 43° 47′ 00″ E (Niyazi et al. 2020), the province of Makkah occupies the southwestern Hejaz region of Saudi Arabia with an area of 153,128 km² and a population of around 8,557,766 (GAS 2020). The partially urbanized watershed with a drainage area of 1725 km² that encompasses most of Makkah City, as shown in Fig. 1, was selected for this study. In recent decades, Makkah has witnessed extraordinarily rapid urbanization, making it the third-largest densely populated metropolitan center in Saudi Arabia and one of the world's fastest-growing cities. The urban fraction of Makkah Province increased threefold between 1992 and 2013 (Alahmadi and Atkinson 2019). Similarly, Makkah City witnessed an increase in the urban area from 12 to 22.13% between 1992 and 2016 (Al Jabri and Alhazmi 2017). The urban areas exposed to flooding in Makkah City increased 25-fold between 1988 and 2019 (Abdelkarim and Gaber 2019), and more than half of Makkah's road network is now prone to the impact of high floods (Al-Baroudi et al. 2013).
The annual average rainfall of Makkah is about 101.2 mm (Dawod and Mirza 2012), mostly attributed to rare high-intensity storm events resulting in flash floods in this area. These flash floods produce massive amounts of water that pass around and through the city. The most memorable devastating extreme flood events occurred in Makkah in 1941, 1969, 2005, 2008, 2010, 2014, 2015, and 2018. These infrequent floods caused loss of human life and extensive property damage. They can partially be attributed to the location of Makkah, which includes valleys, hills, and mountains with steep slopes (Fig. 1). The city's rapid development from a medium-sized to a large city significantly increased impervious areas. The occurrence of these flooding events and the rapid urban expansion prompted the city to require extensive flood impact studies before any development is approved.

Soil type and land use
Soil type data for the study area were downloaded from the SoilGrids™ global digital soil mapping system (www.SoilGrids.org) as mass fractions of clay, silt, and sand in percentages. Accordingly, we constructed the soil type map for the Makkah watershed using ArcGIS and the soil texture classification of the US Department of Agriculture (USDA), Natural Resources Conservation Service (NRCS) (https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/?cid=nrcs142p2_054167). The developed map was compared with the available aerial photographs of Makkah city and with a site investigation and geotechnical evaluation conducted by Khairy et al. (2010) in the Makkah watershed. The final soil type map is shown in Fig. 2. We obtained the land use/cover map at 300 × 300 m spatial resolution for the Makkah watershed from the OpenLandMap data portal (www.openlandmap.org), and adjustments were made after comparison with aerial photographs. The final land use/cover map is shown in Fig. 2.

Storm events and precipitation products
In this study, four storm events were selected to assess and compare the performance of five satellite precipitation products. The products include the three IMERG precipitation products, PERSIANN-CCS, and TRMM. Limited rain gauge data for the four storm events were obtained from the Ministry of Water and Electricity (MOWE). Figure 3 shows that five out of twelve rain gauges are located inside the study area. Only a few of these rain gauges recorded the storm events and could be used to estimate rainfall for the Makkah watershed.
The spatial and temporal resolution of TRMM Multi-Satellite Precipitation Analysis TMPA (3B42) is 0.25° × 0.25° (27 × 27 km) and three hours, respectively. The most recent version of the product (3B42V7), which comprises near-real-time and research-grade products (Huffman et al. 2007;Huffman and Bolvin 2013), was used in this study. The TMPA approach calibrates IR-derived estimates with microwave (MW) data and generates estimates that include MW-derived rainfall estimates when and where MW data are available, as well as calibrated IR estimates when MW data are not available (Huffman et al. 2007). Several recent studies (e.g., Bharti and Singh 2015) demonstrated that TMPA 3B42V7 performs significantly better than earlier versions of the product (TMPA 3B42V6 and earlier).
The PERSIANN product uses infrared image data to compute rainfall by artificial neural networks (Hong et al. 2004). The PERSIANN technique employs a neural network methodology to develop associations between IR and MW data, which are then applied to the IR data to estimate rainfall (Sorooshian et al. 2000). The PERSIANN-Cloud Classification System (PERSIANN-CCS) is a real-time global high-resolution (0.04° × 0.04°, about 4.5 × 4.5 km) product developed by the Center for Hydrometeorology and Remote Sensing at the University of California, Irvine (Hsu et al. 2013). The product algorithm categorizes cloud-patch features based on the areal extent, cloud height, and variability of texture estimated from satellite imagery and applies a variable-threshold cloud segmentation algorithm, which makes it possible to assign rainfall values to pixels within each cloud based on a specific curve describing the relationship between rain rate and brightness temperature.
The Global Precipitation Measurement (GPM) mission is the follow-up mission to TRMM. GPM was developed by NASA (the National Aeronautics and Space Administration) and JAXA (the Japan Aerospace Exploration Agency). It is composed of one Core Observatory satellite, which carries a dual-frequency radar and a multi-channel microwave imager, and about 10 partner satellites (GPM 2018; Hou et al. 2014). The GPM IMERG algorithm integrates satellite retrievals from the CMORPH, TMPA, and PERSIANN algorithms. The 2014 version of the Goddard Profiling Algorithm (GPROF2014) was first used to process these input datasets. Then, using the Climate Prediction Center's (CPC) Morphing-Kalman Filter (CMORPH-KF) Lagrangian time interpolation methodology and the PERSIANN-Cloud Classification System (PERSIANN-CCS) recalibration methodology, the result was re-gridded to a half-hourly, 0.1° × 0.1° (11 × 11 km) scale (Tan et al. 2017). Finally, to improve the accuracy of the product, the monthly Global Precipitation Climatology Centre (GPCC) product was used to perform a bias adjustment. A bilinear interpolation approach was used to convert the original GPCC product with 1° spatial resolution to the IMERG 0.1° resolution.
The IMERG for GPM includes three products. The "Early" and "Late" multi-satellite products are available approximately 4 h and 14 h after observation, respectively. Climatological coefficients are used to calibrate these two products. The third product is the "Final" run product, available approximately three months after observation (Huffman et al. 2018). This product is adjusted based on combined satellite-gauge monthly data. These products have a temporal resolution of half an hour and a spatial resolution of 0.1° × 0.1°. Version 6 of the processing algorithm products (IMERGV06) is used in this study.
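Because the IMERG products are half-hourly while TMPA is three-hourly, event totals and cross-product comparisons require temporal aggregation. The following is a minimal sketch of that step, assuming the half-hourly values are rain rates in mm/h; the function names and the treatment of missing values as zero rain are illustrative assumptions, not the processing chain used in this study.

```python
import numpy as np

def accumulate_half_hourly(rates_mm_per_hr, step_hours=0.5):
    """Return the event total (mm) from a series of rain rates (mm/h)."""
    rates = np.asarray(rates_mm_per_hr, dtype=float)
    # Assumption: missing retrievals (NaN) are treated as zero rain.
    rates = np.nan_to_num(rates, nan=0.0)
    return float(np.sum(rates * step_hours))

def to_three_hourly(rates_mm_per_hr, step_hours=0.5):
    """Aggregate half-hourly rates (mm/h) into 3-hourly depths (mm)."""
    rates = np.nan_to_num(np.asarray(rates_mm_per_hr, dtype=float), nan=0.0)
    steps_per_block = int(round(3.0 / step_hours))  # 6 half-hour steps
    usable = (rates.size // steps_per_block) * steps_per_block
    # Drop a trailing partial block so every depth covers a full 3 h.
    blocks = rates[:usable].reshape(-1, steps_per_block)
    return blocks.sum(axis=1) * step_hours
```

Aggregating the finer product to the coarser time step, rather than disaggregating TMPA, avoids inventing sub-3-hourly structure that the coarser product does not contain.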

Gridded Surface Subsurface Hydrologic Analysis (GSSHA) model
GSSHA is a physically based, distributed-parameter hydrologic model that simulates hydrologic, hydraulic, water quality, and sediment transport processes. GSSHA incorporates one-dimensional flow (channel flow and unsaturated flow) and two-dimensional flow (overland flow and groundwater flow), which are simulated on a structured grid. The model employs finite volume and finite difference techniques to solve transport equations and uses 1D diffusive-wave channel routing and 2D diffusive-wave overland flow routing (Downer et al. 2002). The major hydrologic processes that GSSHA can simulate are precipitation, precipitation interception, overland water retention, snowfall accumulation and melting, overland flow routing, infiltration, exfiltration, channel routing, evapotranspiration, lateral groundwater flow, stream/groundwater interaction, and soil moisture in the vadose zone (Downer 2008; GSSHA Primer 2018). Infiltration in GSSHA can be simulated using four options: Richards' equation, Green and Ampt (GA), multi-layered GA, and Green and Ampt with Redistribution (GAR) (Richards 1931; GSSHA Primer 2018). The GAR method, described by Eq. (1), is used in this study.
Overland flow routing can be computed using one of three numerical techniques: explicit, alternating direction explicit (ADE), and ADE with prediction-correction (ADE-PC). The choice among these techniques depends on the type of catchment. Equations (2) to (7) are used to calculate flow in the ADE scheme used in this study. The inter-cell flows along the x and y directions are calculated using Eqs. (2) and (3). Equation (4) is used to compute the flow depths in each cell at the n + 1 time level in the x-direction. The inter-cell flows in the y-direction are calculated using Eq. (5), and the updated column depths are calculated from these flows using Eq. (6). In these equations, p_ij and q_ij are the overland flows from cell ij in the x and y directions, respectively; Δx = Δy are the cell dimensions; n is Manning's roughness coefficient; d_ij is the depth of water in cell ij at the nth time level; and S_fx and S_fy are the water surface slopes in the x and y directions, respectively.
Manning's equation (Eq. 7) is used to compute the discharge for routing the channel flow, where Q is the inter-cell flow, n is the channel roughness, S_f is the friction slope in the x direction, A is the cross-sectional area (m²), and R is the hydraulic radius.
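For orientation, the overland and channel routing relations referred to as Eqs. (2) to (7) can be sketched as follows. This is a reconstruction under stated assumptions, based on the symbol definitions above and on the general form of a Manning-type alternating direction explicit update; the time step Δt, the intermediate time level n + 1/2, and the index conventions are assumptions and may differ from the original equations. Note that n denotes Manning's roughness coefficient where it appears in the denominator and the time level where it appears as a superscript.

```latex
\begin{align}
p_{ij}^{n} &= \frac{1}{n}\left(d_{ij}^{n}\right)^{5/3}\left(S_{fx}^{n}\right)^{1/2} \tag{2}\\
q_{ij}^{n} &= \frac{1}{n}\left(d_{ij}^{n}\right)^{5/3}\left(S_{fy}^{n}\right)^{1/2} \tag{3}\\
d_{ij}^{n+1/2} &= d_{ij}^{n} + \frac{\Delta t}{\Delta x}\left(p_{i-1,j}^{n} - p_{ij}^{n}\right) \tag{4}\\
q_{ij}^{n+1/2} &= \frac{1}{n}\left(d_{ij}^{n+1/2}\right)^{5/3}\left(S_{fy}^{n+1/2}\right)^{1/2} \tag{5}\\
d_{ij}^{n+1} &= d_{ij}^{n+1/2} + \frac{\Delta t}{\Delta x}\left(q_{i,j-1}^{n+1/2} - q_{ij}^{n+1/2}\right) \tag{6}\\
Q &= \frac{1}{n}\,A\,R^{2/3}\,S_{f}^{1/2} \tag{7}
\end{align}
```

In this sketch the depths are first updated from the x-direction flows, and the y-direction flows are then recomputed from the intermediate depths before the final depth update, which is what distinguishes an alternating direction scheme from a fully explicit one.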

Comparison of satellite precipitation products
The satellite precipitation products used have different temporal and spatial resolutions. The finest temporal resolution of half-hour is provided by the three IMERG products (Early, Late, Final), and the PERSIANN CCS provides the finest spatial resolution of 0.04° (approximately 4 × 4 km).
On the other hand, TRMM TMPA has the coarsest spatial and temporal resolutions: 0.25° (about 25 km) and 3 h, respectively. The Inverse Squared Distance interpolation method was used to smooth the rainfall maps for all satellite precipitation products to facilitate visual comparison in Fig. 4. Only a few of the twelve rain gauges were operating during the storm events studied. All satellite precipitation products captured the four storms. The spatial distribution of the total satellite-based rainfall for the 13 February 2010 event is shown in Fig. 4. According to the satellite estimates, the event lasted for about 14 h, with the five products reporting different peak rainfall times. The three IMERG products show that the highest rainfall amounts were recorded in the northern portion of the watershed. This spatial distribution is quite different from those estimated by the other two products, especially when compared to the Early and Late products. TRMM TMPA estimates show that the north-western portion of the watershed recorded the highest rainfall amounts. In contrast, the PERSIANN-CCS product estimates the highest amount over the middle portion of the watershed.
Fig. 4 Rainfall maps for the 13 February 2010 storm event as estimated by the five satellite products
The Makkah J114 rain gauge (Fig. 3) recorded a total rainfall accumulation of 27 mm for the 13 February 2010 storm event. The total rainfall accumulation for this event was estimated at 55 mm in Wadi Uranah (Al-Baroudi et al. 2013). The collocated total rainfall estimated by the three IMERG (Early, Late, and Final), TMPA, and PERSIANN-CCS products at the location of the gauge is shown in Table 5. The Early and Late products were the closest to the total rainfall accumulation reported by Al-Baroudi et al. (2013), while the IMERG Final, TMPA, and PERSIANN-CCS products underestimated this event the most. The peak rainfall estimated by PERSIANN-CCS occurred one hour before that of the three IMERG products and one hour after that of TMPA (Fig. 5).
As expected, there was a very strong correlation among the three IMERG products, while PERSIANN-CCS had weak correlations with the Early and Late products and a very weak correlation with the Final product (Table 1). TMPA had a moderate correlation with the three IMERG products and the PERSIANN-CCS product.
Figure 6 shows agreement between the IMERG Early and Late products in the spatial distribution of total precipitation for the 30 December 2010 event. The other products showed lower rainfall amounts and different spatial patterns. Only two rain gauges recorded the total rainfall accumulations for this event: Makkah J114 and Arafah 9004, with rainfall totals of 44.5 and 12 mm, respectively, giving a watershed-averaged total rainfall of 22.11 mm based on interpolation using the Inverse Squared Distance method. However, Bastawesy et al. (2012) reported a watershed-averaged total rainfall of 51 mm for the event. Table 5 shows the total accumulations reported by the satellite products at the locations of Makkah J114 and Arafah 9004, together with the watershed-averaged total rainfall estimated by the three IMERG (Early, Late, and Final), TMPA, and PERSIANN-CCS products. The Early and Late products were the closest to the total rainfall accumulation reported by Bastawesy et al. (2012). The temporal distribution of the rainfall varies among the satellite products. Both PERSIANN-CCS and TMPA have significantly different temporal patterns compared to the three IMERG products (Fig. 7). As shown in Table 2, the IMERG products were highly correlated, while TMPA was negatively correlated with all other products, which indicates that its spatial distribution was significantly different from that of the other products, as shown in Fig. 6.
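The inter-product correlations discussed above can be illustrated with a short sketch. It assumes that each product's event-total rainfall has already been resampled to a common grid over the watershed and that the Pearson coefficient is the correlation measure; both are assumptions, as are the function and variable names.

```python
import numpy as np

def product_correlation_matrix(grids):
    """Pearson correlation between rainfall grids of identical shape.

    grids: list of 2D arrays, one per satellite product (hypothetical
    inputs, already resampled to the same cells).
    Returns an (n_products, n_products) correlation matrix.
    """
    # Flatten each grid so every row is one product and every column
    # is one grid cell, which is the layout np.corrcoef expects.
    stacked = np.vstack([np.asarray(g, dtype=float).ravel() for g in grids])
    return np.corrcoef(stacked)
```

A negative entry in the resulting matrix means that one product places its heaviest rainfall where the other places its lightest, which is the interpretation given to the negative TMPA correlations.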
The spatial distribution of total rainfall for the 3 November 2018 event is shown in Fig. 8. Three rain gauge stations recorded the total rainfall accumulations for this event: Makkah J114, Arafah 9004, and Muntasaf-Huda J205, with rainfall totals of 7.8, 22, and 20 mm, respectively. The total accumulations reported by the five satellite rainfall products at the locations of the three gauges are shown in Table 5. The watershed-averaged total rainfall observed by the gauges was 18.36 mm using the Inverse Squared Distance method. The average total rainfall estimated by the three IMERG (Early, Late, and Final), TMPA, and PERSIANN-CCS products is shown in Table 5. The spatial patterns are significantly different among all the products (Fig. 8). Figure 9 shows significant variations in the magnitudes and the temporal patterns between the IMERG Early product and the other products, except for the Late product, which has a very strong correlation coefficient of 0.97. Table 3 shows the correlations between the products. Again, TMPA is negatively correlated with all other products, and its spatial distribution is significantly different from that of the other products, as shown in Fig. 8. The spatial distribution of total rainfall for the 23 November 2018 event is shown in Fig. 10.
Four rain gauge stations recorded the total rainfall accumulations for the event: Makkah J114, Arafah 9004, Al-Ferine J113, and Muntasaf-Huda J205, with rainfall totals of 55, 1.7, 9, and 14.2 mm, respectively. The total accumulations recorded by the satellite products at the locations of the gauges are shown in Table 5. The watershed-averaged total rainfall observed by the gauges was 15.5 mm using the Inverse Squared Distance method. The average total rainfall estimated by the three IMERG (Early, Late, and Final), TMPA, and PERSIANN-CCS products is shown in Table 5. This is the only event that showed reasonable agreement among the five products (Figs. 10, 11; Tables 4, 5).
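The Inverse Squared Distance interpolation used above to derive watershed-averaged gauge rainfall can be sketched as follows. The coordinates in the usage note are hypothetical placeholders, not the surveyed gauge locations, so the sketch is not expected to reproduce the watershed averages reported in this section; the function name and the small-distance guard are likewise assumptions.

```python
import numpy as np

def inverse_squared_distance(gauge_xy, gauge_values, target_xy, eps=1e-12):
    """Interpolate gauge totals to target points with 1/d^2 weights.

    gauge_xy:     (n_gauges, 2) projected coordinates (e.g., metres)
    gauge_values: (n_gauges,) event rainfall totals (mm)
    target_xy:    (n_targets, 2) coordinates of grid-cell centres
    """
    gauge_xy = np.asarray(gauge_xy, dtype=float)
    gauge_values = np.asarray(gauge_values, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    # Squared distance from every target to every gauge: (n_targets, n_gauges).
    diff = target_xy[:, None, :] - gauge_xy[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    # Guard against division by zero when a target coincides with a gauge.
    weights = 1.0 / np.maximum(d2, eps)
    return (weights * gauge_values).sum(axis=1) / weights.sum(axis=1)
```

A watershed average would then be the mean of the interpolated values over all grid cells inside the basin, for example `inverse_squared_distance(gauges, totals, cell_centres).mean()` with hypothetical `gauges`, `totals`, and `cell_centres` arrays. With only two to four reporting gauges, the result is dominated by the nearest gauge over most of the basin, which is one reason the gauge-based averages should be read with caution.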

Model setup
The GSSHA model (Downer and Ogden 2004, 2006) was used to simulate the selected storm events over the Makkah watershed. The Watershed Modeling System (WMS) software (Aquaveo 2020) and ArcGIS were used for pre- and post-processing of model input and output. The digital elevation model (DEM) for the watershed, with 10-m resolution, was obtained from King Abdulaziz City for Science and Technology (https://www.kacst.edu.sa/). The watershed and stream network were delineated using the TOPAZ tool in WMS. The stream network was compared to aerial photographs of the watershed, and minor adjustments were made. To avoid errors, WMS was also used to perform filling and smoothing processes on the 150-m grid. The Cleandam tool in WMS was used to remove digital dams and to fill depressions and pits in the original DEM. GSSHA uses equally sized square grid cells for hydrologic, hydraulic, sediment, and water quality simulations. All processes are simulated over each grid cell, resulting in a fully distributed-parameter hydrologic model. The 2D overland flow was calculated using the alternating direction explicit (ADE) method, and the 1D channel flow from each grid cell was calculated using the diffusive wave equations. The Green and Ampt with redistribution method (Ogden and Saghafian 1997) was used to calculate infiltration, and the values of hydraulic conductivity were taken from Rawls et al. (1983). The hydrologic parameters used in this study are shown in Table 6. Estimating the antecedent soil moisture was a challenge due to the lack of field observations, as it is tied not only to meteorological conditions, land cover, and soil properties but also to subsequent land use management practices, particularly soil compaction (Gregory et al. 2006; Shi et al. 2007; Pouyat et al. 2010; Smith et al. 2013). In light of this uncertainty and for simplicity, we assumed the initial soil moisture to represent normally dry conditions in this study, as shown in Table 7. Cross-sectional geometries were extracted from the DEM, while the trapezoidal shape was used for streams with surveyed cross-sectional data. Manning's roughness coefficients were mapped to the available land use/cover classes based on the values estimated by Sharif et al. (2010a, b). The grid resolution was selected to be 150 m, and the simulation time step was 30 s. As a result of steep slopes, the high degree of impervious cover, and the fast response times in urban areas, runoff is mainly infiltration-excess (Hortonian) runoff. Accordingly, subsurface runoff and groundwater flow were neglected. Roughness values between 0.02 and 0.05 were assigned to stream channels, while the overland roughness values are shown in Table 8.
Fig. 8 Rainfall maps for the 3 November 2018 storm event as estimated by the five satellite products

GSSHA model simulation
Like many other areas in the developing world, the study area suffers from severe data scarcity. However, two of the events discussed have one peak discharge estimate each, based on high-water mark observations at interior points. The peak discharge was estimated at the outlet of the Wadi Uranah sub-watershed (610.8 km²) for the 13 February 2010 event and at the outlet of the Wadi Al-Nu'man sub-watershed (683.4 km²) for the 3 November 2018 event (Fig. 1). These estimates of peak discharge were based on the highest observed flow depth and channel characteristics; however, no accurate estimate of the peak timing was available. Accordingly, a single performance statistic, the error in peak discharge (ε_p), was used to assess the discharge predicted by the GSSHA model; it is defined in terms of P_o, the observed peak discharge, and P_s, the simulated peak discharge.
The 13 February 2010 storm event peak was estimated at 431 m³/s based on high-water marks (1.6 m) (Fig. 12). The simulated peak discharge based on the IMERG Early product matched the observed discharge almost perfectly, with a peak discharge error (ε_p) of about 1% (Table 9, Fig. 12). The simulation based on PERSIANN-CCS resulted in the highest peak discharge error (78.47%). The hydrograph simulated with the IMERG Final product also significantly underestimated the peak discharge, while the IMERG Late product underestimated the peak discharge by just 12.72%. The peak for the 3 November 2018 storm event was estimated to range from 640 to 680 m³/s based on an observed maximum flow depth of about 3.5 to 4 m measured at three points by a consultant at the outlet of Wadi Al-Nu'man near the University of Umm Al Qura in Makkah (Abdelkarim and Gaber 2019).
The IMERG Early product also produced the best-matching hydrograph at the outlet of Wadi Al-Nu'man for the 3 November 2018 storm event, with a peak error of under 1%. The IMERG Late product also performed well, with a peak error of 4%. The highest error in the peak discharge was produced by the TMPA product, with a peak discharge error of 95.36% (Table 9, Fig. 13). The significant variation in the simulated hydrographs among the satellite precipitation products is driven by the large variation in rainfall estimates (Fig. 9). Table 9 shows the performance statistics for each product for the two storm events. Figure 14 shows the hydrographs at the outlet of the entire Makkah watershed for all four storm events driven by the five satellite products. For all events, the IMERG Early product produced the largest runoff volume and peak discharge, followed by the Late product. On the other hand, the IMERG Final, PERSIANN-CCS, and TMPA products produced significantly lower runoff volumes and peaks.
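As a minimal illustration of the peak discharge error statistic used in this section, the following Python sketch assumes the error is the absolute difference between the simulated and observed peaks, expressed as a percentage of the observed peak. The function name `peak_discharge_error` and the simulated value are hypothetical and serve only to show the arithmetic; only the 431 m³/s observed peak is taken from the text.

```python
def peak_discharge_error(observed_peak: float, simulated_peak: float) -> float:
    """Absolute peak discharge error as a percentage of the observed peak.

    Assumed form: |P_s - P_o| / P_o * 100. The use of the absolute value
    (rather than a signed error) is an assumption of this sketch.
    """
    if observed_peak <= 0:
        raise ValueError("observed_peak must be positive")
    return abs(simulated_peak - observed_peak) / observed_peak * 100.0


# Observed peak for the 13 February 2010 event (from high-water marks).
observed = 431.0  # m^3/s
# Hypothetical simulated peak, chosen only to illustrate the calculation.
simulated = 400.0  # m^3/s

error_pct = peak_discharge_error(observed, simulated)
print(f"Peak discharge error: {error_pct:.2f}%")
```

Because only one observed peak per event is available and the peak timing is unknown, a single scalar statistic of this kind is all that the field data can support.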

Propagation of rainfall errors
The results shown in Figs. 12 and 13 indicate that the hydrographs produced by IMERG Early matched the limited field observations of the peak discharge better than those produced by all other products. To quantify the propagation of satellite rainfall errors (differences) into runoff predictions over the Makkah watershed, we use the IMERG Early rainfall estimates as the reference rainfall and the simulated hydrograph based on this product as the reference hydrograph. For spatially averaged rainfall, we compute the random and systematic errors (differences) of the other four satellite products by calculating the centralized root mean square error (CRMSE) and the relative mean error (RME). RME is a measure of the systematic error of a product and is usually one-directional (under- or over-estimation), while CRMSE quantifies random errors that depend on the sensor sampling (Derin et al. 2019). We use the same error statistics to estimate the influence of rainfall error on the hydrologic model response. Table 10 shows the detailed formulae for the RME and CRMSE measures. The CRMSE and RME were computed for all storm events and all satellite rainfall products, as shown in Tables 11 and 12. As can be seen from Fig. 15, the IMERG Late product has the lowest RME for all storm events. The RME values calculated for the discharge estimates were much higher than those calculated for the rainfall values, which confirms the propagation of precipitation error into the hydrologic response. In general, the TRMM product shows the largest underestimation (−0.58). Figure 16 shows the CRMSE of the satellite rainfall values and the resulting discharges. Overall, for all rainfall products and all events, rainfall errors (compared to the reference product) were significantly amplified in runoff predictions.
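The two error statistics can be sketched in Python as follows. The exact formulae are those given in Table 10, which is not reproduced here, so this sketch assumes commonly used definitions: RME as the sum of differences divided by the reference total, and CRMSE as the root mean square of the differences after their mean has been removed (not normalized). The function names and the short time series are hypothetical and are not values from the study.

```python
import math
from statistics import fmean


def relative_mean_error(estimate, reference):
    """Systematic error: sum of (estimate - reference) over the reference total."""
    if len(estimate) != len(reference):
        raise ValueError("series must have the same length")
    total_ref = sum(reference)
    if total_ref == 0:
        raise ValueError("reference total must be non-zero")
    return sum(e - r for e, r in zip(estimate, reference)) / total_ref


def centralized_rmse(estimate, reference):
    """Random error: RMSE of the differences after removing their mean (bias)."""
    if len(estimate) != len(reference):
        raise ValueError("series must have the same length")
    diffs = [e - r for e, r in zip(estimate, reference)]
    mean_diff = fmean(diffs)
    return math.sqrt(fmean([(d - mean_diff) ** 2 for d in diffs]))


# Hypothetical basin-averaged rainfall series (mm/h), for illustration only.
reference_rain = [0.0, 2.0, 6.0, 4.0, 0.0]  # plays the role of the reference product
other_rain = [0.0, 1.0, 3.0, 2.0, 0.0]      # plays the role of another product

rme = relative_mean_error(other_rain, reference_rain)
crmse = centralized_rmse(other_rain, reference_rain)
print(f"RME = {rme:.2f}, CRMSE = {crmse:.2f} mm/h")
```

Applying the same two functions to the simulated discharge series, with the reference hydrograph in place of the reference rainfall, gives the discharge error statistics; comparing the rainfall and discharge values of each statistic then indicates whether the error was amplified or dampened.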

Infiltration distribution
The spatial distribution of cumulative infiltration for the 3 November 2018 storm event has a similar pattern for all the precipitation products except for minor differences (Fig. 17).
As expected, the cumulative infiltration was high in sand and sandy loam soils. The IMERG Early, IMERG Late, and PERSIANN-CCS products resulted in very similar infiltration (Fig. 17). However, the IMERG Late product had a higher cumulative infiltration depth than the Early product, which in turn had a higher cumulative infiltration depth than the other satellite products. The Final product resulted in high cumulative infiltration in the middle and upstream parts of the study area, while TRMM showed high cumulative infiltration in the upstream part of the watershed. Table 13 shows that the relationship between rainfall volume and infiltration is not linear and depends on the variability of rainfall and soil properties. The lowest percentage of infiltrated rainfall, 53.16%, was produced by the Late product, with a rainfall volume of 83,142,661.2 m³. In comparison, the highest percentage of infiltrated rainfall, 76.84%, was produced by the TRMM product, with a rainfall volume of 18,617,243.7 m³ (Table 13).
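The non-linearity can be made concrete with a short calculation from the two rainfall volumes and infiltration percentages quoted above. The following Python sketch assumes the percentage is defined as infiltrated volume divided by rainfall volume; the infiltrated volumes it prints are derived here for illustration and are not values reported in Table 13. The function name `infiltrated_volume` is hypothetical.

```python
def infiltrated_volume(rainfall_volume_m3: float, infiltration_pct: float) -> float:
    """Infiltrated volume implied by a rainfall volume and an infiltration percentage."""
    return rainfall_volume_m3 * infiltration_pct / 100.0


# Rainfall volume (m^3) and infiltration percentage quoted in the text.
reported = {
    "IMERG Late": (83_142_661.2, 53.16),
    "TRMM": (18_617_243.7, 76.84),
}

# Derived volumes: computed from the quoted figures, not reported in the paper.
derived = {
    product: infiltrated_volume(rain_m3, pct)
    for product, (rain_m3, pct) in reported.items()
}
for product, volume in derived.items():
    print(f"{product}: implied infiltrated volume = {volume:,.0f} m^3")

# If infiltration scaled linearly with rainfall, these two ratios would be equal.
rain_ratio = reported["IMERG Late"][0] / reported["TRMM"][0]
infiltration_ratio = derived["IMERG Late"] / derived["TRMM"]
print(f"Rainfall ratio: {rain_ratio:.2f}, infiltration ratio: {infiltration_ratio:.2f}")
```

Under this assumption, the rainfall volume of the Late product is several times that of TRMM, while the implied infiltrated volume grows by a smaller factor, which is consistent with infiltration-excess runoff taking a larger share of the heavier rainfall.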

Conclusions
This study evaluated the performance of the three IMERG products and the TRMM and PERSIANN-CCS rainfall products over a 1725 km² arid watershed using rain gauge observations of four storms. Moreover, hydrologic model simulations driven by the five satellite rainfall products were conducted to compare the resulting hydrographs. A physically based, fully distributed hydrologic model was used to simulate runoff in order to highlight the effect of the interaction between the variability of rainfall fields and watershed properties. Overall, the comparison of watershed-averaged rainfall showed that the IMERG Early and Late products compared best with the observations among all satellite precipitation products. The correlation was strong among the three IMERG products, especially between the Early and Late products, while the TMPA and PERSIANN-CCS products showed a moderate correlation with each other and a mostly weak correlation with the three IMERG products. The peak rainfall estimated by the three IMERG products occurred at almost the same time, while the TMPA and PERSIANN-CCS products showed different peak times. The spatial and temporal distribution of rainfall varied among the satellite products for all the storm events, except for the Early and Late products, which showed a similar distribution in most of the storm events.
The Early and Late products showed lower errors in estimating flood depths and peak discharge for the two events that had peak discharge estimates. However, their accuracy is subject to significant uncertainties, especially given the differences among the products in spatial patterns. The significant variation in the simulated hydrographs among the satellite precipitation products can be attributed to the large variation in rainfall estimates. This might be related to the inherent errors in the techniques used for satellite rainfall estimation and to the limited number of ground stations used to calibrate these products. Moreover, the calibration of satellite rainfall products, typically performed on a monthly basis, is not well suited to simulating a single storm event that lasts less than one day, and sometimes only a few hours, as is the case in the study area.
The limited peak discharge observations indicated that the IMERG Early run product produced the best hydrographs. Therefore, it was used as a reference to quantify the propagation of satellite rainfall errors into runoff predictions over the Makkah watershed. The results clearly indicated that both systematic and random rainfall errors were significantly amplified in the runoff predictions. In addition, the relationship between infiltration and rainfall volume was found to be non-linear and to depend on the variability of rainfall and soil properties.
In summary, the IMERG Early product outperformed the IMERG Final product and the other satellite products on all criteria. The study area is characterized by high spatial and temporal variation in the rainfall pattern due to the topographic complexity of the Makkah region. For such an environment, it is desirable to have a network of ground stations with sufficient temporal and spatial resolution to enable proper verification of the satellite rainfall estimates. Moreover, calibration at a daily or sub-daily scale, rather than monthly, should be considered. Overall, all the satellite precipitation products captured the storm events selected in this study and provided some helpful information. However, the IMERG Early and Late products should be used in regions with sparse or non-existent rainfall data only with caution, especially in applications that require high-resolution rainfall data.
Acknowledgements The authors would like to thank King Fahd University of Petroleum and Minerals for support during this work.
Author contributions AAl-A, MAl-Z, and HS developed the research methodology. AAl-A downloaded and processed the remote sensing products. AAl-A, MAl-Z, and HS developed the model and performed calibration and validation. AAl-A prepared the first draft. MAl-Z and HS performed the final overall proofreading of the manuscript.
Funding The authors declare that this work has been done in the absence of any external source of funding.

Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.