1 Introduction

Precipitation is one of the most important climate elements directly affecting the availability of water resources (Randall et al. 2007). Regionalization of precipitation regimes is also essential for the optimum design and management of water-related activities. The spatial distribution and temporal trends of precipitation have been widely studied in several regions, such as the Mediterranean (Rodrigo and Trigo 2007; Martínez et al. 2007; Longobardi and Villani 2009; Beguería et al. 2009; Mourato et al. 2010; López-Moreno et al. 2010), Iran (Modarres and Sarhadi 2009; Tabari and Hosseinzadeh Talaee 2011; Raziei et al. 2012, 2014; Soltani et al. 2012; Dinpashoh et al. 2014), India (Guhathakurta and Rajeevan 2008; Jain and Kumar 2012; Dimri and Dash 2012; Jhajharia et al. 2012) and Africa (Cheung et al. 2008; Senapathi et al. 2010; Kassile 2013). Likewise, precipitation regimes have been studied in regions such as Sri Lanka (Puvaneswaran and Smithson 1993), the southwestern United States and northern Mexico (Comrie and Glenn 1998), Nepal (Kansakar et al. 2004) and China (Chen et al. 2009; Xiao et al. 2013).

There are several methods for climate regionalization, including principal component analysis (PCA), hierarchical cluster analysis (HCA), empirical orthogonal function (EOF) analysis, and simple correlation analysis (Chen et al. 2009). The purpose of PCA is to identify the most important correlation structures among a number of variables, so that the major part of the overall variance can be described by a few linear combinations of the original variables (Muñoz-Díaz and Rodrigo 2004). Arriaga-Ramírez and Cavazos (2010) applied rotated PCA to monthly precipitation data in S-mode (multiple stations over time) to delineate precipitation regions in northwest Mexico and the southwest United States. The retained PC scores corresponding to R-mode (multiple times over station) or S-mode are combined, and the retained PCs can then be subjected to cluster analysis (CA) to better identify different sub-regions (e.g., Marzban and Sandgathe 2006). CA, like PCA, is known for its ability to divide a dataset into homogeneous and distinct groups whose members have similar characteristics (Shukla et al. 2000). CA is one of the most useful tasks in the data mining process for discovering groups and identifying interesting patterns in the underlying data (Halkidi et al. 2001). Some significant applications of PCA in climate research include those by Yu et al. (1997), Monahan (2000), Hsieh (2001), Cannon et al. (2002), Camberlin and Diop (2003), Giannini et al. (2003), Astel et al. (2004), Muñoz-Díaz and Rodrigo (2004) and Jhajharia et al. (2013).

Cluster analysis methods are either hierarchical or non-hierarchical. Hierarchical methods are based on a distance matrix. Euclidean distance is the most commonly used measure, although many other distance measures exist (Gong and Richman 1995). In Iran, Domroes et al. (1998) applied PCA and HCA to mean monthly precipitation from 71 stations, obtaining five homogeneous precipitation regions. Dinpashoh et al. (2004) regionalized Iran’s precipitation climate using multivariate methods, obtaining six homogeneous precipitation regions. Soltani et al. (2007) used monthly precipitation time series from 28 main sites to determine the regional rainfall climates of Iran. A hierarchical cluster analysis was applied to the autocorrelation coefficients at different lags, and three main rainfall climatic groups were found. Raziei et al. (2008) applied PCA to nine seasonal and annual precipitation-derived parameters from 140 stations in western Iran, covering the period 1965–2000. Applying CA to the four varimax-rotated PC scores suggested that five spatially homogeneous sub-regions exist, characterized by different precipitation regimes.

All these studies used station-based data to regionalize precipitation regimes, without any gridded datasets. Understanding spatial variation in climatic conditions is key to many agricultural, hydrological, and natural and water resources management studies. However, meteorological and climatic variables are usually measured at meteorological stations, and the data are valid only for the point of measurement. To overcome this problem, the high-resolution gridded daily precipitation dataset from the Asian Precipitation Highly Resolved Observational Data Integration Towards Evaluation of Water Resources (APHRODITE) database was used to regionalize Iran’s precipitation regime. This database is being constructed by APHRODITE’s Water Resources project in collaboration with the Research Institute for Humanity and Nature and the Meteorological Research Institute of the Japan Meteorological Agency. The APHRODITE project has released a high-resolution daily gridded precipitation dataset covering the Eurasian continent, with the aim of developing state-of-the-art daily precipitation datasets on high-resolution grids (Yatagai et al. 2009). The data used in the APHRODITE analysis were: (1) GTS-based data (the global summary of the day); (2) data precompiled by other projects or organizations; and (3) APHRODITE’s own collection (Yatagai et al. 2012). The aim of the present study is to regionalize precipitation regimes in Iran using principal component analysis and cluster analysis.

2 Data and Methods

Iran is located in southwest Asia. Its surface area is about 1,648,000 km², and it lies approximately between latitudes 25°N and 40°N and longitudes 44°E and 64°E. The two high mountain ranges of Zagros and Alborz, in the west and north respectively, strongly affect the temporal and spatial patterns of precipitation and prevent much of the available moisture from reaching the interior deserts (Alijani et al. 2008). Spatiotemporal precipitation patterns in Iran are highly variable owing to complex terrain (i.e., mountain ranges, central deserts and coastal areas), geographic latitude, and the several air masses that influence Iran throughout the year (Nazaripour and Mansouri Daneshvar 2014). The seasonal distribution of precipitation is controlled by the interaction of the tropical air mass from the Red Sea, the Mediterranean low pressures, the Siberian high pressure and the migratory high pressures from the west.

The APHRODITE daily gridded precipitation data for the Middle East for 1951 to 2007 were obtained from APHRODITE’s Water Resources web site (http://www.chikyu.ac.jp/precip/products). This dataset covers the area from 20°E to 65°E and 15°N to 45°N at a resolution of 0.25° × 0.25°, and was clipped to the 2,491 pixels covering Iran’s surface area (Fig. 1a). The distribution of long-term mean annual precipitation in Iran is shown in Fig. 1b. The Zagros and Alborz mountain ranges play an important role in the distribution and amount of precipitation received in the country. The maximum annual precipitation is observed in northern Iran, on the southwestern coast of the Caspian Sea, where the mean annual precipitation exceeds 1,300 mm. In the Zagros Mountains, the mean annual precipitation is about 700 mm. The lowest mean annual precipitation, less than 100 mm, occurs in the central and southeastern parts of Iran.

Fig. 1

a The clipped APHRODITE dataset on the elevation terrain map of Iran containing 2,491 pixels with resolution of 0.25° × 0.25°; b Mean annual precipitation (mm) over Iran in years 1951 to 2007

The objective of principal component analysis (PCA) is to reduce the number of predictive variables by transforming them into a smaller set of new variables, called principal components (PCs). To better elucidate the influence of each original variable on the PCs, a rotation algorithm (here, the varimax rotation) was applied to obtain the rotated factor loadings (Jhajharia et al. 2013). For the hierarchical clustering, the Euclidean distance was used as the distance measure between observations and Ward’s method as the linkage rule. Among the various distance measures and linkage rules that can be used in hierarchical cluster analysis, this combination has been shown to yield the most distinctive groups (Mansouri Daneshvar 2014).
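As an illustration of the clustering step described above, the following minimal sketch groups observations with Ward linkage on Euclidean distances using SciPy. The function name `ward_regions`, the synthetic three-column score matrix and the choice of three groups are assumptions made only for this example; they are not the study's data or its software.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def ward_regions(scores, n_regions):
    """Cluster the rows of `scores` with Ward linkage on Euclidean distance.

    Returns one integer label (1..n_regions) per row.
    """
    tree = linkage(scores, method="ward", metric="euclidean")
    # Cut the dendrogram so that at most `n_regions` clusters remain.
    return fcluster(tree, t=n_regions, criterion="maxclust")


# Synthetic stand-in for PC scores: three well-separated groups of 40 points.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 0.0], [0.0, 5.0, 5.0]])
scores = np.vstack([c + 0.3 * rng.standard_normal((40, 3)) for c in centres])
labels = ward_regions(scores, n_regions=3)
```

In practice the number of clusters is usually chosen by inspecting the dendrogram rather than fixed in advance, so `n_regions` here is only a convenience for the sketch.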

3 Results and Discussion

3.1 Interpretation of Mean Seasonal Precipitation Data

In this study, we considered seasonal precipitation data in the form of eight derived variables (Figs. 2 and 3). In these figures, the mean seasonal precipitation amount and its percentage have been interpolated over Iran. In winter, the highest precipitation amounts are recorded in the Zagros Mountains. During winter, the southeastern parts of the country receive about 65 % of the total annual precipitation, whereas the north and northwestern parts receive less than 30 %. In spring, the precipitation maximum, with more than 200 mm, shifts to the north and northwest (Fig. 2a to b). Spring precipitation decreases from the northwestern areas towards the southern regions. During this season, some regions in the northwest receive about 40 % of the total annual precipitation (Fig. 2c to d). Owing to the subtropical high pressure, summer is the driest season in Iran, except in the northern areas and some regions in the southeast. The contribution of summer precipitation to the total annual precipitation on the southwestern coasts of the Caspian Sea and in the southeastern areas of Iran is about 24 and 21 %, respectively (Fig. 3a to b). Autumn precipitation extends over the whole of Iran. Over 42 % of the total annual precipitation occurs in the autumn season. The highest autumn precipitation occurs southwest of the Caspian Sea, and the lowest contribution of autumn precipitation to the total is observed in the mid-eastern areas of Iran (Fig. 3c to d). These patterns show that the Iranian precipitation regimes vary spatially. The northwestern areas receive most of their precipitation in spring, while in the southeastern parts of the country most precipitation is recorded in winter. In the southwestern parts of the Caspian Sea, the maximum precipitation percentage is observed in autumn. The contribution of summer precipitation to the total received precipitation is high in the southeastern part of the country. In the present study, we focus on regionalizing the precipitation regimes in Iran in order to detect homogeneous sub-regions by using principal component analysis and hierarchical cluster analysis.
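The eight derived variables (four mean seasonal amounts and their four percentages of the annual total) can be sketched as follows. This is only an illustration under stated assumptions: the text does not define the seasons, so meteorological seasons (December to February, and so on) are assumed, and the function name `seasonal_variables`, the array layout and the toy data are hypothetical.

```python
import numpy as np

# Assumed season definition (not stated in the text): meteorological seasons.
SEASONS = {
    "winter": (12, 1, 2),
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
}


def seasonal_variables(daily, months, n_years):
    """Return an (n_cells, 8) array: four mean seasonal amounts (mm per season),
    followed by their four percentages of the mean annual total.

    daily  : (n_days, n_cells) daily precipitation in mm
    months : (n_days,) calendar month (1-12) of each day
    """
    amounts = np.column_stack([
        daily[np.isin(months, m)].sum(axis=0) / n_years
        for m in SEASONS.values()
    ])
    annual = amounts.sum(axis=1, keepdims=True)
    # Cells with no precipitation at all get 0 % instead of a division error.
    percent = np.divide(100.0 * amounts, annual,
                        out=np.zeros_like(amounts), where=annual > 0)
    return np.hstack([amounts, percent])


# Toy data: two years of 30-day months and two grid cells.
months = np.tile(np.repeat(np.arange(1, 13), 30), 2)
daily = np.ones((months.size, 2))
daily[~np.isin(months, SEASONS["winter"]), 1] = 0.0  # cell 1: winter rain only
X = seasonal_variables(daily, months, n_years=2)
```

The first toy cell receives 1 mm every day, so each season contributes equally; the second receives rain only in winter, so its winter percentage is 100.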

Fig. 2

Mean seasonal precipitation: a winter precipitation amount in mm; b the percentage of winter precipitation distribution; c spring precipitation amount in mm; d the percentage of spring precipitation distribution

Fig. 3

Mean seasonal precipitation: a summer precipitation amount in mm; b the percentage of summer precipitation distribution; c autumn precipitation amount in mm; d the percentage of autumn precipitation distribution

3.2 Description of the Principal Component Analysis

Data analysis began with calculation of the seasonal precipitation amount and percentage for each grid cell. Then, S-mode PCA was applied to the correlation matrix of the eight derived seasonal precipitation variables, which are mapped using GIS in Figs. 2 and 3. Use of the correlation matrix, as opposed to the covariance matrix, allows dry grid cells in the deserts to be compared directly with relatively wet grid cells in the mountains (Comrie and Glenn 1998). The initial statistics are presented in Table 1; principal components with eigenvalues greater than one were selected. The retained components were subjected to varimax rotation to enhance interpretability. The advantage of a varimax rotation is that it keeps the principal components uncorrelated (Jolliffe 1986; Wilks 1995; Paschalidou et al. 2009). An orthogonal rotation based on the varimax criterion is widely applied to obtain more spatially localized, uncorrelated principal components (Storch and Zwiers 1999). White et al. (1991), in their comparison of rotation techniques, found that oblique rotations generally produced the best results for climate regionalization. Miller and Goodrich (2007), however, found little difference between the oblique and orthogonal rotation methods, and preferred the orthogonal method because it tends to preserve the maximum loading of individual components (Cook et al. 1999; Frei and Robinson 1999; McCabe et al. 2004).
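A minimal sketch of PCA on the correlation matrix with the eigenvalue-greater-than-one selection rule is given below. Rows stand for grid cells and columns for the derived variables; the function name `pca_correlation` and the synthetic data (eight variables driven by three hypothetical latent factors) are assumptions for illustration and do not reproduce Table 1.

```python
import numpy as np


def pca_correlation(X):
    """PCA of the correlation matrix of X (rows = grid cells, columns = variables).

    Returns all eigenvalues (descending), the percentage of variance each
    explains, and the loadings of the components with eigenvalue > 1.
    """
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                       # eigenvalue-greater-than-one rule
    # Loadings are correlations between variables and retained components.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = 100.0 * eigvals / eigvals.sum()
    return eigvals, explained, loadings


# Synthetic example: 8 variables generated from 3 independent latent factors.
rng = np.random.default_rng(42)
factors = rng.standard_normal((500, 3))
pattern = np.zeros((3, 8))
pattern[0, :3] = 1.0
pattern[1, 3:6] = 1.0
pattern[2, 6:] = 1.0
X = factors @ pattern + 0.1 * rng.standard_normal((500, 8))
eigvals, explained, loadings = pca_correlation(X)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above one marks a component that explains more variance than any single standardized variable.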

Table 1 Eigenvalues, explained variance and cumulative variance of eight principal components

In this study, we followed the recommendations of Miller and Goodrich (2007) and applied an orthogonal varimax rotation to obtain an eigenvector-based regionalization of seasonal precipitation regimes. Three rotated components were retained on the basis of the scree plot and eigenvalues greater than 1.0; together they explain 88.95 % of the total variance in the data. Loading values above 0.7 were considered important and indicate good correlation between the variables and the principal components (Table 2). The standardized rotated PC scores were then calculated using the regression method, as shown in Fig. 4a to c.
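The rotation and scoring steps can be sketched as below. The `varimax` function is a commonly used SVD-based iteration for the varimax criterion, and `regression_scores` uses the identity that, for orthogonally rotated principal components, regression-method scores reduce to Z L (L'L)^-1, where Z is the standardized data and L the rotated loadings. Both function names and the synthetic data are assumptions for this example; the software actually used in the study is not stated.

```python
import numpy as np


def varimax(loadings, tol=1e-8, max_iter=500):
    """Orthogonal varimax rotation; returns (rotated loadings, rotation matrix)."""
    p, k = loadings.shape
    T = np.eye(k)
    d_old = 0.0
    for _ in range(max_iter):
        Lr = loadings @ T
        grad = loadings.T @ (Lr**3 - Lr @ np.diag((Lr**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        T = u @ vt                      # nearest orthogonal matrix to the gradient
        d = s.sum()
        if d_old != 0.0 and d / d_old < 1.0 + tol:
            break
        d_old = d
    return loadings @ T, T


def regression_scores(Z, rotated):
    """Standardized scores for orthogonally rotated PCA loadings: Z L (L'L)^-1."""
    return Z @ rotated @ np.linalg.inv(rotated.T @ rotated)


# Synthetic example with 8 variables; three components are retained, as in the text.
rng = np.random.default_rng(1)
X = rng.standard_normal((400, 3)) @ rng.standard_normal((3, 8))
X += 0.2 * rng.standard_normal((400, 8))
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
top = np.argsort(eigvals)[::-1][:3]
unrotated = eigvecs[:, top] * np.sqrt(eigvals[top])
rotated, T = varimax(unrotated)
scores = regression_scores(Z, rotated)
```

An orthogonal rotation redistributes variance among the retained components without changing each variable's communality, and the resulting scores remain uncorrelated with unit variance.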

Table 2 Loading values of varimax rotated principal components
Fig. 4

a Spatial distribution of rotated score of PC1; b spatial distribution of rotated score of PC2; c spatial distribution of rotated score of PC3; d spatial regionalization of precipitation regimes over Iran

PC1, which explains 47.2 % of the total variance, has positive loadings on autumn precipitation, both as seasonal mean amount and as percentage, and high positive loadings on winter precipitation. The standardized PC1 score shows high positive values in the southeastern part of the Caspian Sea and the Zagros mountain range, and negative values in the southeastern parts of the country. In the southwestern parts of the Caspian Sea, autumn precipitation is about six times higher than the mean autumn precipitation. In addition, winter precipitation in the Zagros mountain range is about four times higher than the mean winter precipitation. The lowest PC1 scores, in the southeastern part of the country, indicate low autumn and winter precipitation depths and a low autumn percentage of total precipitation.

PC2, which accounts for 23.77 % of the total variance, has high positive loadings on spring precipitation amount and percentage, and negative loadings on winter precipitation percentage. Spatially, positive PC2 scores are observed in the northwestern and northeastern parts of Iran. The highest PC2 values are located in the northwest of the country and indicate the highest contribution of spring precipitation, and a low contribution of winter precipitation, to the total precipitation. The high spring precipitation in northwestern Iran is attributed to convection driven by surface heating and the availability of atmospheric moisture. In contrast, the negative PC2 scores over the mid-southern parts imply that the amount and contribution of spring precipitation are lower than the winter precipitation percentage. At the onset of the warm season, as the westerly waves shift northward, tropical air masses and subtropical high-pressure systems come to govern the region from the south. As a result, the pressure gradients and instabilities that occur during the cold season decrease.

PC3, with 17.96 % of explained variance, has positive loadings on the amount and percentage of summer precipitation. As Fig. 4c shows, the southern parts of the Caspian Sea and the southeastern areas of Iran have positive scores. In these regions, the amount and contribution of summer precipitation are relatively high. The main sources of moisture for precipitation in these areas are the Caspian Sea and the Oman Sea. Apart from these two regions, the PC3 scores are negative in all other parts of Iran, indicating a low amount and contribution of summer precipitation to the total precipitation. Compared with the other PC scores, the difference between the highest and lowest PC3 scores is relatively large, which implies high spatial variability of summer precipitation.

3.3 Regionalization of Mean Seasonal Precipitation Data

To identify the homogeneous precipitation regimes, we applied HCA to the three abovementioned PC scores. The classification was achieved by hierarchical agglomerative cluster analysis using Ward’s method, which was selected because it typically outperforms other algorithms in terms of separation, giving relatively dense clusters with small within-group variance (Mansouri Daneshvar et al. 2013). Based on the HCA, nine distinct regions were recognized across Iran and mapped using GIS (Fig. 4d). To check the statistical significance of the identified regions, the non-parametric Kolmogorov–Smirnov test was applied to the regional distributions, with the null hypothesis that the distributions are the same. The results of the test are given in Table 3; most differences between the regional distributions are significant at the 1 % level in all four seasons.
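The pairwise comparison of regional distributions can be sketched with SciPy's two-sample Kolmogorov–Smirnov test. The helper name `pairwise_ks`, the 1 % threshold passed as `alpha` and the synthetic winter percentages are assumptions for illustration only; they are not the values behind Table 3.

```python
from itertools import combinations

import numpy as np
from scipy.stats import ks_2samp


def pairwise_ks(values, labels, alpha=0.01):
    """Two-sample KS test for every pair of regions.

    Returns {(a, b): (statistic, p_value, differs)}, where `differs` is True
    when the null hypothesis of identical distributions is rejected at `alpha`.
    """
    results = {}
    for a, b in combinations(np.unique(labels), 2):
        res = ks_2samp(values[labels == a], values[labels == b])
        results[(int(a), int(b))] = (
            float(res.statistic), float(res.pvalue), bool(res.pvalue < alpha)
        )
    return results


# Synthetic winter percentages for three hypothetical regions of 200 cells each:
# regions 1 and 2 share a distribution, region 3 is clearly lower.
rng = np.random.default_rng(7)
winter_pct = np.concatenate([
    rng.normal(60.0, 5.0, 200),
    rng.normal(60.0, 5.0, 200),
    rng.normal(30.0, 5.0, 200),
])
labels = np.repeat([1, 2, 3], 200)
results = pairwise_ks(winter_pct, labels)
```

Running the test once per season on the seasonal percentage of each region would give a matrix of pairwise outcomes comparable in layout to a table of region-by-region significance levels.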

Table 3 The results of Kolmogorov–Smirnov test applied to check group’s distribution of seasonal precipitation percentage in the nine regions

3.4 Statistical Analysis of Precipitation Regimes in Nine Regions

In winter, the distributions were the same in the southern regions 1, 2 and 3, and in the southern areas of the Caspian Sea, regions 7 and 8 (Fig. 4d). The distribution of spring precipitation differed among all regions at the 99 % confidence level. The distribution of summer precipitation was also statistically different among the regions, except between regions 3 and 4, where the difference was not significant because the contribution of summer precipitation in these regions was very low (<1 %). The distribution of the autumn contribution to total precipitation differed among the identified regions, except between regions 1 and 2. From the results of the Kolmogorov–Smirnov test on the seasonal percentage of precipitation in the regions, it can be concluded that the precipitation regimes of the regions are different. The distribution of seasonal precipitation percentage for each region is shown in Fig. 5, and the statistical data of the nine regions are summarized in Tables 4 and 5.

Fig. 5

Total percentage of seasonal precipitation within nine regions a region 1; b region 2; c region 3; d region 4; e region 5; f region 6; g region 7; h region 8; i region 9

Table 4 Mean annual and seasonal precipitation and their standard deviation (SD), maximum (Max) and minimum (Min) values in the nine regions
Table 5 Seasonal precipitation as a percentage of the total amount, with standard deviation (SD), maximum (Max) and minimum (Min) values, in the nine regions
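Summary statistics of the kind reported in Tables 4 and 5 can be computed per region as in the following sketch. The function name `region_summary` and the toy values are hypothetical. Note also that the share of grid cells is only an approximation of the share of surface area, because the area of a 0.25° × 0.25° cell shrinks with latitude.

```python
import numpy as np


def region_summary(values, labels):
    """Mean, SD, max and min of `values`, plus the share of grid cells (%),
    for each region label."""
    summary = {}
    for r in np.unique(labels):
        v = values[labels == r]
        summary[int(r)] = {
            "mean": float(v.mean()),
            "sd": float(v.std(ddof=1)),      # sample standard deviation
            "max": float(v.max()),
            "min": float(v.min()),
            "cell_share_pct": 100.0 * v.size / values.size,
        }
    return summary


# Toy example: five grid cells of annual precipitation (mm) in two regions.
annual_mm = np.array([100.0, 110.0, 120.0, 900.0, 1000.0])
labels = np.array([1, 1, 1, 9, 9])
stats = region_summary(annual_mm, labels)
```

A closer approximation of area share would weight each cell by the cosine of its latitude before summing.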

Region 1, in the southeast of the country, covers about 9.4 % of Iran’s surface area and contains most of southeastern Iran. It is a very dry region, with an average total precipitation of 105 mm. The main rainy season is winter, which accounts for about 56.7 % of the total precipitation. In winter, this region is affected by the southern track of Mediterranean and Red Sea cyclones. The contributions of summer precipitation to total precipitation in regions 7, 8 and 9 are the highest in Iran. These high precipitation values are affected by the southern summer monsoon.

Region 2 covers about 16 % of Iran’s surface area, extending across the southeastern parts of Iran. It is another dry region, with a mean total precipitation of 111 mm. As in most regions, the main rainy season is winter, when about 64 % of the total precipitation is received. The contribution of summer precipitation to the total precipitation is 3 %.

Region 3, the foothills of the Zagros Mountains, covers about 9.1 % of Iran’s surface area. The mean annual precipitation in this region is 234.7 mm. The largest share of total precipitation falls in winter (55.3 %), followed by autumn (34.2 %). The contributions of spring and summer to total precipitation are very low (9.5 and 0.8 %, respectively).

Region 4 includes the highest part of the Zagros mountain range. It is distinctive because of its high altitude and its substantial rainfall depth of 460.9 mm. The main rainy season in this region is winter. The region is strongly affected by moist southwesterly spells and warm airflows associated with the Red Sea low pressure, in combination with the winter southern track of Mediterranean cyclones. The contribution of summer precipitation to total precipitation is negligible (0.49 %).

Region 5 is the largest region, covering about 29.7 % of Iran’s surface area, and contains the central parts of Iran. Owing to the rain-shadow effects of the two mountain ranges, Zagros in the west and Alborz in the north, the mean annual precipitation is about 117 mm. The main rainy season is winter, and the contribution of the other seasons to total precipitation is relatively low.

Region 6 includes areas extending from the northwest to the northeast and covers 16.6 % of Iran’s surface area. The mean annual precipitation is about 231.6 mm. The main rainy season is winter, with a contribution of 42.9 %. The shares of spring and autumn in total precipitation are broadly similar (28.5 and 25.6 %, respectively). The northwestern parts of Iran show the highest contribution of spring precipitation and the lowest contribution of winter precipitation.

Region 7 includes the northwestern and northeastern parts of Iran, which are characterized by abundant precipitation in spring. This region has a mean annual precipitation of 296.2 mm. The contribution of spring precipitation to the total precipitation is about 35.3 %, while the shares of winter and autumn are about 30.2 and 26.3 %, respectively. The share of summer precipitation is relatively low (8 %). The strong convective activity in this region is attributed to land surface heating and atmospheric humidity in spring. Another reason for the spring precipitation maximum is that the westerly waves do not completely leave the region.

Region 8 contains the southern parts of the Caspian Sea. Its mean annual precipitation (509.4 mm) is higher than that of all other regions except region 9. This region covers about 1.5 % of Iran’s land surface. The wettest season is autumn, which contributes 33.3 % of the total precipitation. At the end of summer, the Caspian Sea water is strongly heated. The Siberian high-pressure anticyclone system brings dry cold air to region 9. The interaction between these dry cold air masses and the moist warm air over the Caspian Sea then results in air turbulence, atmospheric instabilities and orographic precipitation events in the region. The Alborz mountain range on the southern coast of the Caspian Sea blocks the southward-moving northerly airflows and supports these instabilities.

Region 9, in the southwestern part of the Caspian Sea, is characterized by abundant precipitation. It covers only 0.8 % of Iran’s surface area. In this region, the mean annual precipitation is about 972 mm, with a relatively high standard deviation of 169 mm. The most abundant precipitation occurs in autumn. The contribution of summer precipitation to total precipitation is the highest among all regions (22.4 %). The strong pressure gradient between cold air masses originating from Eastern Europe and warm air over the Caspian Sea in autumn results in the high precipitation amounts in the southwestern parts of the Caspian Sea.

The regions identified for Iran in this study compare well with those of previous works in the area (i.e., Domroes et al. 1998; Dinpashoh et al. 2004; Raziei et al. 2008). These studies investigated precipitation clustering on the basis of station datasets, without any gridded datasets. Data from meteorological and climatic stations are valid only for the point of measurement. Furthermore, meteorological stations are not distributed evenly across Iran’s surface area, with lower density towards the central and eastern parts. This uneven distribution makes it difficult to assess and regionalize Iran’s precipitation regime spatially. Hence, the APHRODITE project, with its high-resolution daily gridded precipitation dataset, is an accurate source for precipitation clustering.

4 Conclusion

Precipitation regimes were regionalized for Iran using principal component analysis with varimax rotation, based on eight seasonal precipitation-derived variables for the years 1951 to 2007. Three principal components were extracted on the basis of the scree plot and eigenvalues greater than one. The three rotated components, which explain 88.95 % of the total variance in the data, were used to calculate the PC scores. According to the results, the dominant precipitation regime is a winter regime. In some parts of southern and southeastern Iran, more than half of the total precipitation occurs in winter. Moving away from these regions towards the north and the Caspian Sea coast, the contribution of autumn precipitation to the total becomes higher than that of winter precipitation. The precipitation regime of the northwestern parts of Iran is a spring regime. The contribution of summer precipitation to total precipitation is noticeable in the southern parts of the Caspian Sea and in the southeastern areas. By applying hierarchical cluster analysis to the rotated principal component scores, nine homogeneous precipitation regimes were identified. This study showed that the APHRODITE dataset could potentially be used for regionalization of precipitation regimes in Iran. According to the results, use of this dataset to group precipitation regimes is recommended for arid and semi-arid regions of the mid-latitudes, especially in Middle Eastern countries, while its use in humid and sub-humid regions needs further study.