1 Introduction

Water scarcity and inefficient water use are the main limiting factors for agricultural development and food production in Iran. Precise estimation of evapotranspiration (ET) is needed to increase water use efficiency. Evapotranspiration is the most critical parameter for climate and hydrological studies and one of the main components of the water balance of each region of Iran. Evapotranspiration utilizes around 60% of annual solar radiation received at the Earth’s surface (Wang and Dickinson 2012; Wild et al. 2013). Apart from being involved in the energy balance, ET is a significant component of the water cycle and uses about two-thirds of the rain on Earth (Baumgartner and Reichel 1975). It also plays a crucial role in atmospheric processes, as it determines the supply of water to the atmosphere from oceans and terrestrial areas, and it affects the amount and spatial distribution of global temperature and pressure (Shukla and Mintz 1982). This can affect the incidence of heatwaves (Seneviratne et al. 2006), rainfall processes (Zveryaev and Allan 2010), and the performance of agricultural production, especially in arid and semi-arid regions.

It is widely argued that increasing temperature as a result of climate change has a direct impact on hydrological parameters such as ET (McKenney and Rosenberg, 1993). In this regard, within the period 1960–2005, the minimum and maximum temperatures at weather stations in Urmia Lake Basin (ULB) in north-west Iran increased from 0.1 to 0.5 °C and from 0.1 to 0.3 °C, respectively. This increase in temperature is leading to a decrease in lake level by accelerating the rate of water loss from Lake Urmia, which is considered to be the leading cause of desiccation of this lake. It is also leading to an increase in evapotranspiration and crop water requirements.

In hydrological studies in arid regions, it is vital to have a good understanding of the spatial variation in ET. Analysis of homogeneous areas in terms of climate characteristics, especially ungauged areas or regions with incomplete data, can improve irrigation scheduling and result in more appropriate use of water resources. Since the ET is measured as spot values, but the temperature and ET distribution on the Earth’s surface are highly variable on a spatial scale, measurement of ET offers acceptable accuracy only in small environments. It is not suitable for large environments where weather stations are less densely distributed. Given this limitation, in order to identify the spatial pattern of ET in a region, the data points need to be converted to surface data. Various types of regionalization and interpolation methods can be used to evaluate spatial changes in ET.

If weather data for many stations are studied, a statistical method for identifying homogeneous climate areas should be used. In ungauged basins, model parameters should be estimated from other sources of information. An appropriate method for setting model parameters in basins that lack data is to use model parameters from a similar hydrological basin (Merz et al. 2006). Various types of regionalization methods have been introduced for transfer of parameters from a similar hydrological basin to an ungauged area. These include spatial proximity methods, which use interpolation techniques based on geographical location or spatial distance, such as Kriging, inverse distance weighting (IDW), and spline, and which have been applied previously to determine spatial distributions of ET (Mardikis et al. 2005; Xu et al. 2006; Zhu et al. 2012; Kamali et al. 2015; Nam et al. 2015; Lu et al. 2016).

Clustering (hierarchical and non-hierarchical) methods have also been widely used for this purpose. Da Silva et al. (2017) used the Ward method to classify reference ET for the Amazon region, while Ramos et al. (2008) used K-means and the Ward algorithm for ET clustering in the Sonora River Basin in Mexico. These are examples of the geostatistics, interpolation, and clustering methods used in studies conducted worldwide and in Iran to identify homogeneous ET regions.

There are many widely used methods for estimating hydrological and climate parameters for ungauged stations, but all are usually associated with errors. The recently presented region of influence (RoI) method is the latest regionalization method for solving the problems with conventional methods and is reported to produce accurate and reliable estimates with fewer errors (Wiltshire 1986). The RoI approach was first used as an alternative method for transfer of practical information from nearby weather stations to estimate flow rate at a target station (Wiltshire 1986; Acreman and Wiltshire 1987). In this approach, each station is allowed to have a unique region that creates an area for an ungauged station, making it superior to conventional regionalization methods (Burn 1990a). Zrinji and Burn (1994) confirmed this conclusion for ungauged stations in Canada. A great variety of RoI-based applications have since been used in flood estimation (Burn 1990a, 1990b; Zrinji and Burn 1994; Tasker et al. 1996; Castellarin et al. 2001; Holmes et al. 2002; Merz and Blöschl 2005; Chiu et al. 2005; Eng et al. 2007; Tsang et al. 2011). RoI-based applications have also been used in estimation of extreme rainfall (Gaál et al. 2008a; Gaál and Kyselý 2009; Bharath and Srinivas 2015; Dehghan et al. 2018a; Dehghan et al. 2018b) and in regionalization of low flow (Holmes et al. 2002). The results indicate that the RoI approach is preferable to other methods. However, no previous study has estimated ET using this method.

The first step in regionalization is to identify and collect information that can be used to calculate the proximity and similarity between several weather stations in the desired area, which are defined as attributes. Attributes used previously in investigation using the RoI approach include predictor and geographical variables (Chiu et al. 2005), climatological and geographical characteristics (Gaál et al. 2008b), geological and physical variables (Samuel et al. 2011), and climate, geographical, and statistical attributes with their hybrids (Dehghan et al. 2018b). Merz and Blöschl (2005) and Eng et al. (2007) obtained their best estimates when they considered both predictor variables and geographical proximity. Dehghan et al. (2018b) concluded that statistical attributes, in combination with climate and geographical characteristics, gave the best estimates of quantiles in terms of low relative error, and that skewness can play a useful role in evaluation of quantiles.

The next step in regionalization is to use a tool to determine similarity between stations. In the metric space, this similarity is defined by the distance criterion. Researchers have employed different distance criteria, but the Euclidean distance metric has been used in most studies (Burn 1990a, 1990b; Holmes et al. 2002; Eslamian 2010a, 2010b; Dehghan et al. 2018a, 2018b). By applying appropriate weights for available attributes in regions without data, acceptable and reliable results can be obtained for ungauged stations (Dehghan et al. 2018a).

In previous studies on regionalization, various attributes have been found to affect the goal, depending on conditions in the target region, indicating that all attributes should not be allocated the same degree of importance. There are several techniques for determining the weight of different attributes in multiple-attribute decision-making (MADM) problems, one of which is the Shannon entropy method. Shannon (1948) introduced the concept of information entropy, defined as a measure of the degree of turbulence within a system, which can have a significant effect on the identification of practical elements and their impact. The Shannon entropy concept has been widely used in hydrology (Singh 2011).

The entropy method has been used recently for a range of purposes, including determining the significance of rain gauge stations in spatiotemporal scaling (Wei et al. 2014), field velocity distribution during flood events (Chiu and Tung 2002; Moramarco et al. 2004; Farina et al. 2014), rainfall-runoff modeling (Jowitt 1991), averaged rate of infiltration (Singh 2010a), soil moisture (Al-Hamdan and Cruise 2009; Singh 2010b), distribution of piezometric head in groundwater flow (Barbe et al. 1994), estimation of discharge (Moramarco and Singh 2001; Chiu et al. 2005), and flow and sediment concentrations (Chiu et al. 2000).

Given the strategic location of Lake Urmia in north-west Iran and the fact that it is the largest hypersaline lake in the Middle East, many studies have focused on ULB (Fazel et al. 2017; Dehghan et al. 2018a, 2018b; Haghighi et al. 2018; Akbari et al. 2019). Various studies around the world have analyzed the spatial distribution of ET, but these studies have limitations as they only explain geographical attributes, regardless of their weight in clustering. To extend the analysis, we assessed the applicability of the RoI approach according to the degree of importance and participation of each attribute in regionalization of ETp in ULB. To obtain more accurate results in regionalization, we developed a framework for appropriate weighting in regionalization of ETp. In our novel approach, weighting is based not only on geographical attributes, but also on climatological and statistical attributes. We evaluated and compared the performance of the weighted attributes using both clustering and the RoI approach.

2 Materials and methods

2.1 Study area

The analysis was based on long-term weather data for Urmia Lake Basin in north-west Iran, which lies between 35° 41′–38° 30′ N and 44° 13′–47° 53′ E, and is 140 km long and 40–55 km wide. The basin covers a total area of 52,000 km2, which is approximately 3% of the total area of the entire country. Around 65% of the catchment area of Lake Urmia consists of mountainous regions, 24% of plains and foothills, and 10% is occupied by the lake itself. The basin is surrounded by the northern part of the Zagros Mountains, the southern slopes of the Sabalan Mountains, and the northern, western, and southern hills of Mount Sahand. Lake Urmia, with a maximum depth of 16 m, is classified as a shallow lake, which increases its vulnerability to evaporation. The annual evaporation rate from the lake surface is estimated to be between 0.98 and 1.2 m, reflecting the dry climate in ULB. For the present analysis, daily weather data from 30 stations (see Table 1) were obtained from the Meteorological Organization and Water Resources Management Company of Iran. The historical data covered 20 years, 1997–2016. Figure 1 shows the spatial distribution of selected stations.

Table 1 The geographical location of the selected stations in Urmia Lake Basin
Fig. 1 The geographical location of Urmia Lake Basin and the selected meteorological stations

2.2 Determination of potential evapotranspiration

Potential evapotranspiration can be computed from meteorological data. Numerous studies around the world have found that the adapted FAO Penman-Monteith (FAO-56 PM) model (Eq. 1) is the most accurate method for estimating ETp. This method is widely used and recommended as the standard method for determining ETp from meteorological data (Allen et al. 1998).

$$ {\mathrm{ET}}_{\mathrm{P}}=\frac{0.408\Delta \ \left(\mathrm{Rn}-G\right)+\upgamma \left(\frac{900}{T+273}\right){u}_2\left({e}_{\mathrm{s}}-{e}_{\mathrm{a}}\right)}{\Delta +\upgamma \left(1+0.34\ {u}_2\right)} $$
(1)

where ETP is the potential crop evapotranspiration (mm day−1), Δ is the slope of the saturation vapor pressure function (kPa °C−1), Rn is the net radiation (MJ m−2 day−1), G is the soil heat flux density (MJ m−2 day−1), γ is the psychrometric constant (kPa °C−1), T is the mean temperature (°C), u2 is the wind speed at 2 m height (m s−1), es is the saturation vapor pressure (kPa), ea is the actual vapor pressure (kPa), and (es − ea) is the saturation vapor pressure deficit (kPa). The factor 0.408 = 1/λ (λ = latent heat of vaporization in MJ kg−1) converts units from MJ m−2 day−1 to mm day−1. In this study, all parameters necessary for computing potential evapotranspiration with the FAO-56 PM method were calculated according to the procedure developed by Allen et al. (1998).
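As a minimal sketch of how Eq. (1) can be evaluated once its inputs are available, the function below simply transcribes the equation. The function name, argument names, and the input values are illustrative assumptions and are not data from the study stations; the intermediate quantities (Δ, Rn, γ, es, ea) are assumed to have been computed beforehand following Allen et al. (1998).

```python
def fao56_pm(delta, rn, g, gamma, t_mean, u2, es, ea):
    """Daily ETp (mm/day) from Eq. (1); units as defined in the text."""
    # Radiation term: 0.408 converts MJ m-2 day-1 to mm day-1.
    radiation_term = 0.408 * delta * (rn - g)
    # Aerodynamic term driven by wind speed and vapor pressure deficit.
    aerodynamic_term = gamma * (900.0 / (t_mean + 273.0)) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs only (hypothetical, not taken from the ULB stations).
et_p = fao56_pm(delta=0.122, rn=13.28, g=0.0, gamma=0.0666,
                t_mean=16.9, u2=2.078, es=1.997, ea=1.409)
print(round(et_p, 2))
```

With these illustrative inputs the result is a little under 4 mm day−1; both terms of the numerator vanish when Rn = G and es = ea, so the function then returns zero.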

2.3 Selection of attributes

The information used to calculate the similarity between different weather stations in the study area was divided into different attributes. The types of attributes used in the regionalization method play a key role in the success of further regionalization steps. A wide range of statistical, climatological, and geographical information was used in this study to effectively transfer data from the basin stations to reference stations. The geographical proximity of stations is considered to be a suitable indicator of similarity in evapotranspiration values. However, simple geo-proximity between two points cannot by itself be interpreted as similarity of stations. As the number of attributes increases, the probability of creating dependent variables also increases. To increase the precision, climatological and statistical attributes were also considered in this study. Among the geographical attributes, longitude (x), latitude (y), and height above sea level (h) were selected. The set of climatological site attributes consisted of average daily wind speed (WS), average daily relative humidity (RH), and average daily temperature (T). Coefficient of variation (CV), coefficient of skewness (CS), and the ratio of CV to CS were selected as the statistical attributes.

2.4 Weighting approach

Weighting coefficients represented the relative importance of the attributes for each of the selected stations. Since not all stations in the RoI of the reference station are equally close to it, a weighting function was also needed to reflect the relative importance of each station for estimation of ETp at the reference station. The Shannon entropy was used to calculate the weight of the different geographical, climate, and statistical attributes.

2.4.1 Shannon entropy

In thermodynamics, entropy is a measure of the disorder of a system, and the concept was used by Shannon (1948) to describe uncertainty in information sources. In information theory, entropy is specified as the amount of irregularity in a system. Therefore, measured entropy can be used to estimate the heterogeneity of the attributes required in ETp estimation. The greater the dispersion of the values of an attribute across stations, the more important that attribute is. The process of calculating the Shannon entropy can be expressed in a series of steps (Shannon 1948):

  • SE1: Normalize the decision matrix:

$$ {f}_{ij}=\frac{x_{ij}}{\sum_{i=1}^n{x}_{ij}},\kern0.75em \left(i=1,\dots, n,\kern0.5em j=1,\dots, m\right) $$
(2)
  • SE2: Compute entropy:

$$ {E}_j=-k{\sum}_{i=1}^n{f}_{ij}\ln {f}_{ij} $$
(3)
$$ k=\frac{1}{\ln n} $$
(4)

where xij is the rating of station i concerning attribute j, fij is the normalized xij, m is the number of attributes, n is the number of stations, Ej is the amount of dispersion or entropy in attribute j, and k is the entropy constant.

  • SE3: Determine uncertainty:

$$ {d}_j=1-{E}_j $$
(5)

where dj represents the uncertainty or degree of deviation of the data for attribute j.

  • SE4: Determine the significance of attribute j:

$$ {\hat{W}}_j=\frac{d_j}{\sum_{j=1}^m{d}_j} $$
(6)

where \( {\hat{W}}_j \) denotes the weight of attribute j.
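The four steps SE1 to SE4 can be sketched in a few lines of code. This is only an illustration under stated assumptions: each attribute is normalized over the n stations, so that one entropy value and one weight are obtained per attribute; all ratings are assumed non-negative with a positive column sum; and terms with a zero share are taken to contribute nothing to the entropy. The function name and the example matrix are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Return one Shannon-entropy weight per attribute (column).

    matrix: list of rows, one row per station, holding non-negative ratings.
    """
    n = len(matrix)          # number of stations
    m = len(matrix[0])       # number of attributes
    k = 1.0 / math.log(n)    # entropy constant, Eq. (4)
    deviations = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [x / total for x in column]                           # SE1
        entropy = -k * sum(f * math.log(f) for f in shares if f > 0)   # SE2
        deviations.append(1.0 - entropy)                               # SE3
    d_sum = sum(deviations)
    return [d / d_sum for d in deviations]                             # SE4


# Hypothetical ratings: 4 stations x 2 attributes.
# The first attribute is identical at all stations, the second varies.
ratings = [[1.0, 10.0], [1.0, 20.0], [1.0, 30.0], [1.0, 40.0]]
weights = entropy_weights(ratings)
```

In this toy case the constant attribute has maximum entropy and therefore receives essentially zero weight, while the varying attribute receives almost all of it, which is the behavior the method relies on.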

2.5 Distance metric

In metric space, similarity is defined by the distance criterion. If the attributes of two catchment areas are the same, the measurement distance is zero. As the difference in attributes increases, the measurement distance increases. Several distance metrics have been proposed to express similarity, including Manhattan, Canberra, and Minkowski. In RoI-based regionalization, the Euclidean distance is the most widely used. It is the straight-line distance between two stations in attribute space, and its weighted form is defined as:

$$ {D}_{ij}={\left({\sum}_{m=1}^M{W}_m{\left({X}_m^i-{X}_m^j\right)}^2\right)}^{1/2} $$
(7)

where Dij is the weighted Euclidean distance between stations i and j, Wm is the weight of the mth attribute for the reference station, satisfying Wm ≥ 0 and \( {\sum}_{m=1}^M{W}_m=1 \), \( {X}_m^i \) and \( {X}_m^j \) are the values of the mth attribute at stations i and j, and M is the number of attributes. The distance metric matrix D is symmetrical (Dij = Dji) with zero values on its main diagonal (Dii = 0). Since the selected attributes may have different units, it is necessary to convert the initial data before computing Dij. The most straightforward alternative is to standardize the variables. In Eq. (7), \( {X}_m^i \) and \( {X}_m^j \) are therefore the standardized values of the attributes.
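A minimal sketch of Eq. (7) is given below. The text does not state which standardization was applied, so the z-score with the population standard deviation used here is an assumption, as are the function names and the toy values.

```python
import math
import statistics


def standardize(values):
    """Z-score one attribute across stations (assumed form of standardization)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def weighted_distance(xi, xj, weights):
    """Weighted Euclidean distance of Eq. (7) between two attribute vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, xi, xj)))


def distance_matrix(stations, weights):
    """Matrix D for all station pairs; symmetric with a zero main diagonal."""
    return [[weighted_distance(si, sj, weights) for sj in stations]
            for si in stations]


# Hypothetical, already standardized attribute vectors for three stations.
stations = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
D = distance_matrix(stations, weights=[0.5, 0.5])
```

Because each squared difference is multiplied by its attribute weight, an attribute with a weight close to zero contributes almost nothing to the distance, however different the stations are in that attribute.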

2.6 Definition of threshold

After selecting the appropriate attributes and calculating the distance metric matrix, the first step in the RoI approach was to select a threshold value or cutoff point for the reference stations. Once the threshold of the metric distance has been determined for the ith station, only the stations with a metric distance not exceeding the threshold value fall within the RoI of reference station i:

$$ {\mathrm{RoI}}_i=\left\{j:{D}_{ij}\le {\theta}_i\right\} $$
(8)

where RoIi is the set of stations j in the region of influence of station i and θi is the threshold value for station i (Burn 1990b).

Burn (1990b) presented a general framework for determining the threshold distance θi and the weight ηij of each station in the RoI, with three different options (#1–#3).

2.6.1 Option #1

In option #1, the RoI for the reference stations contains a limited number of stations, and all selected stations are assigned a weight within the range 0–1, expressed as follows:

$$ {\theta}_i={\theta}_{\mathrm{L}}\kern1em \mathrm{if}\kern1em {\mathrm{NS}}_i\ge \mathrm{NST}, $$
(9)

and

$$ {\theta}_i={\theta}_{\mathrm{L}}+\left({\theta}_{\mathrm{U}}-{\theta}_{\mathrm{L}}\right)\left(\frac{\mathrm{NST}-{\mathrm{NS}}_i}{\mathrm{NST}}\right)\kern1em \mathrm{if}\kern1em {\mathrm{NS}}_i<\mathrm{NST} $$
(10)

where θL and θU are the lower and upper threshold values for station i (25th and 75th percentile of Euclidean distance), respectively, NST is the number of stations that can be nearby in RoIi, and NSi is the number of stations in the RoI of the reference station. The weighting function for option #1 is:

$$ {\eta}_{ij}=1-{\left(\frac{D_{ij}}{\mathrm{TP}}\right)}^n $$
(11)

where ηij is the weight of station j in the RoIi, TP is the 85th percentile of the Euclidean distance for option #1, and n = 2.5.
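Option #1 can be sketched as two small functions, one for the threshold of Eqs. (9) and (10) and one for the station weight of Eq. (11). The function names and the numerical inputs are illustrative assumptions; in the study, θL, θU, and TP are percentiles of the Euclidean distances.

```python
def option1_threshold(theta_l, theta_u, ns_i, nst):
    """Threshold theta_i of Eqs. (9)-(10)."""
    if ns_i >= nst:
        return theta_l
    # Fewer stations than NST: relax the threshold towards theta_u.
    return theta_l + (theta_u - theta_l) * (nst - ns_i) / nst


def option1_weight(d_ij, tp, n=2.5):
    """Weight eta_ij of Eq. (11) for a station at distance d_ij."""
    return 1.0 - (d_ij / tp) ** n


# Hypothetical values: 5 stations counted, 10 wanted.
theta_i = option1_threshold(theta_l=0.2, theta_u=0.8, ns_i=5, nst=10)
```

Since θi never exceeds θU (75th percentile) and TP is the 85th percentile, every station inside the RoI has Dij below TP, so the weights stay between 0 and 1.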

2.6.2 Option #2

In option #2, a large number of stations are in the RoI of the reference station, and lower weights are allocated to stations with less similarity. In this case, the threshold value is considered:

$$ {\theta}_i={\theta}_{\mathrm{U}}\kern0.5em $$
(12)

The weighting function for option #2 is defined as:

$$ {\eta}_{ij}=1\kern1em if\kern1.25em {D}_{ij}\le {\theta}_{\mathrm{L}} $$
(13)

and

$$ {\eta}_{ij}=1-{\left(\frac{D_{ij}-{\theta}_{\mathrm{L}}}{\mathrm{TN}-{\theta}_{\mathrm{L}}}\right)}^n\kern0.75em if\kern1.25em {\theta}_{\mathrm{L}}<{D}_{ij}\le {\theta}_{\mathrm{U}} $$
(14)

In option #2, in addition to θL and θU, the weighting function has two other parameters (TN and n). TN is calculated using TPP as:

$$ \mathrm{TN}=\max \left[\underset{\left\{j\right\}}{\max}\left({D}_{ij}\right),\mathrm{TPP}\right] $$
(15)

In this case, θL, θU, and TPP are considered the 25th, 75th, and 85th percentiles of Euclidean distance and n = 0.1 (Burn 1990b).

2.6.3 Option #3

Option #3 is almost the same as option #2, except that all stations in the RoI of reference stations have an appropriate value of the weighting function:

$$ {\theta}_i=\underset{\left\{j\right\}}{\max}\left({D}_{ij}\right) $$
(16)

The weighting function for option #3 is the same as for option #2.
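The weighting function shared by options #2 and #3 (Eqs. 13 to 15) can be sketched as follows. The function names and numbers are hypothetical, and returning a weight of zero for a station beyond the threshold is an assumption of this sketch: the text defines weights only for stations inside the RoI.

```python
def tn_parameter(distances, tpp):
    """TN of Eq. (15): the larger of the maximum distance and TPP."""
    return max(max(distances), tpp)


def option23_weight(d_ij, theta_l, theta_i, tn, n=0.1):
    """Weight eta_ij of Eqs. (13)-(14) for a station at distance d_ij.

    theta_i is theta_U for option #2 and max_j(D_ij) for option #3.
    """
    if d_ij <= theta_l:
        return 1.0                      # Eq. (13): most similar stations
    if d_ij <= theta_i:
        return 1.0 - ((d_ij - theta_l) / (tn - theta_l)) ** n   # Eq. (14)
    return 0.0                          # assumption: station outside the RoI


# Hypothetical distances from one reference station.
tn = tn_parameter([0.1, 0.9, 0.4], tpp=0.7)
```

The only difference between the two options in this sketch is the value passed as `theta_i`, which mirrors the statement that option #3 reuses the weighting function of option #2 with a larger threshold.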

2.7 Clustering method

One of the agglomerative clustering methods used in this study was the Ward method. The Ward algorithm acts to minimize the internal variance of the whole cluster, by aiming to find spherical and dense clusters. It is defined thus:

$$ W={\sum}_{k=1}^K{\sum}_{j=1}^m{\sum}_{i=1}^{N_k}{\left({f}_{ij}^k-{f}_{\bullet j}^k\right)}^2 $$
(17)
$$ {f}_{\bullet j}^k=\frac{\sum_{i=1}^{N_k}{f}_{ij}^k}{N_k} $$
(18)

where W represents the total within-group sum of squares, K is the number of clusters, m is the number of attributes, Nk is the number of stations in cluster k, \( {f}_{ij}^k \) is the normalized value of the jth attribute at the ith station belonging to cluster k, and \( {f}_{\bullet j}^k \) denotes the mean value of the jth attribute for cluster k.
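The quantity W that the Ward method seeks to keep small can be computed directly for any given partition of the stations. The sketch below evaluates Eqs. (17) and (18) for a fixed partition; it does not perform the agglomerative merging itself, and the function name and toy data are assumptions.

```python
def within_cluster_ss(clusters):
    """Total within-group sum of squares W of Eqs. (17)-(18).

    clusters: list of clusters; each cluster is a list of stations,
    and each station is a list of normalized attribute values.
    """
    total = 0.0
    for stations in clusters:
        n_k = len(stations)
        m = len(stations[0])
        # Cluster mean of every attribute, Eq. (18).
        means = [sum(s[j] for s in stations) / n_k for j in range(m)]
        total += sum((s[j] - means[j]) ** 2
                     for s in stations for j in range(m))
    return total


# Hypothetical stations with two attributes each.
two_clusters = [[[0.0, 0.0], [2.0, 0.0]], [[5.0, 5.0]]]
one_cluster = [[[0.0, 0.0], [2.0, 0.0], [5.0, 5.0]]]
w_two = within_cluster_ss(two_clusters)
w_one = within_cluster_ss(one_cluster)
```

Merging the distant third station into the first cluster raises W sharply, which is why the Ward algorithm, at each step, merges the pair of clusters that increases W the least.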

2.8 Regional homogeneity

The identification of homogeneous regions leads to more accurate data transfer. Homogeneous areas include stations that are in the same group. The stations within a group have similar characteristics and, in the formation and integration of a group, all stations with similar characteristics are involved. In this regard, Hosking and Wallis (1993) evaluated several quantifications and developed the heterogeneity (H) and discordancy (Di) measures.

The heterogeneity test is recommended to identify homogeneous regions created by regionalization. If H < 1, a region is considered homogeneous; for 1 ≤ H < 2, a region is considered relatively heterogeneous; and for H ≥ 2, a region is deemed to be heterogeneous (Hosking and Wallis 1993). The heterogeneity test contains the H1, H2, and H3 statistics, which depend on the L-moment ratios: linear variation coefficient (LCV), linear skewness coefficient (LCS), and linear kurtosis coefficient (LCK). Hosking and Wallis found that H2 and H3 could not differentiate between homogeneous and heterogeneous regions and concluded that H1, based on LCV, had the highest potential for differentiation. Therefore, H1 is recommended as the primary index of heterogeneity and is more appropriate for this test. It is calculated as:

$$ H=\frac{V-{\mu}_V}{\sigma_V} $$
(19)

where V is the weighted variance of LCV for the studied region, and μV and σV are the mean and standard deviation, respectively, of the values of V obtained from simulated regions.
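Once V has been computed for the observed region and for a set of simulated regions, Eq. (19) and the classification rule reduce to a few lines. The sketch below assumes that the simulated V values are already available (generating them requires fitting a distribution to the regional L-moments, which is not shown), uses the sample standard deviation as an assumption, and uses hypothetical names and numbers.

```python
import statistics


def heterogeneity(v_observed, v_simulated):
    """H of Eq. (19), with mu_V and sigma_V taken from simulated V values."""
    mu_v = statistics.fmean(v_simulated)
    sigma_v = statistics.stdev(v_simulated)
    return (v_observed - mu_v) / sigma_v


def classify(h):
    """Region classes as described in the text (limits at H = 1 and H = 2)."""
    if h < 1:
        return "homogeneous"
    if h < 2:
        return "relatively heterogeneous"
    return "heterogeneous"


# Hypothetical values; a real test would use many more simulated regions.
h1 = heterogeneity(0.045, [0.02, 0.03, 0.04])
label = classify(h1)
```

Here the observed V lies 1.5 simulated standard deviations above the simulated mean, so the toy region falls in the intermediate class.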

The discordancy test identifies stations that are discordant with the group as a whole in terms of the L-moment ratios. The critical values of Di (Hosking and Wallis 1997) are shown in Table 2. Stations with Di higher than the critical value are discordant, and removing or moving discordant stations will make all regions homogeneous in the study area. The discordancy statistic is calculated as:

$$ {D}_i=\frac{1}{3}N{\left({\hat{u}}_i-\overline{u}\right)}^T{A}^{-1}\left({\hat{u}}_i-\overline{u}\right) $$
(20)
$$ A={\left(N-1\right)}^{-1}\ {\sum}_{i=1}^N\left({\hat{u}}_i-\overline{u}\right){\left({\hat{u}}_i-\overline{u}\right)}^T $$
(21)

where Di is the discordancy measure for station i, N is the number of stations in the region, \( {\hat{u}}_i \) is a vector containing LCV, LCS, and LCK for the station, \( \overline{u} \) is the regional average for \( {\hat{u}}_i \), and A is the matrix of covariance of the sample.

Table 2 Critical values for the discordancy statistic (Di)

In regional frequency analysis, the appropriate regional distribution is considered the best fit for the stations in a homogeneous region. Therefore, the scoring method can be used to select the best regional distribution. The most commonly used goodness-of-fit methods in previous studies are the chi-square test, Kolmogorov-Smirnov test, and calculation of residual squares. The best-fit distribution can be obtained for homogeneous regions using the values of ZDist defined by Hosking and Wallis (1997):

$$ {Z}^{Dist}=\frac{\left({\tau}_4^{Dist}-{\tau}_4^R+{B}_4\right)}{\sigma_4} $$
(22)

where \( {\tau}_4^R \) is the average L-kurtosis value of the region, \( {\tau}_4^{Dist} \) is the theoretical L-kurtosis value of the fitted distribution, and B4 and σ4 are the bias and standard deviation, respectively, of the L-kurtosis values obtained from simulated data. The fit of a distribution is considered satisfactory if |ZDist| ≤ 1.64. When more than one distribution satisfies this criterion, the preferred distribution is that with the lowest |ZDist| (closest to zero).
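The selection rule based on Eq. (22) can be sketched as follows. The function names and all numerical values are hypothetical and serve only to show the mechanics; in practice the theoretical L-kurtosis of each candidate and the simulation-based B4 and σ4 come from the regional L-moment analysis.

```python
def z_dist(tau4_dist, tau4_region, b4, sigma4):
    """Goodness-of-fit measure of Eq. (22)."""
    return (tau4_dist - tau4_region + b4) / sigma4


def best_distribution(tau4_by_dist, tau4_region, b4, sigma4, critical=1.64):
    """Return (name, Z) of the acceptable candidate with the smallest |Z|,
    or None when no candidate satisfies |Z| <= critical."""
    scores = {name: z_dist(t4, tau4_region, b4, sigma4)
              for name, t4 in tau4_by_dist.items()}
    accepted = {name: z for name, z in scores.items() if abs(z) <= critical}
    if not accepted:
        return None
    name = min(accepted, key=lambda key: abs(accepted[key]))
    return name, accepted[name]


# Hypothetical theoretical L-kurtosis values for three candidates.
candidates = {"GEV": 0.15, "GLOG": 0.19, "GPA": 0.05}
choice = best_distribution(candidates, tau4_region=0.16, b4=0.0, sigma4=0.02)
```

In this toy case two candidates pass the |ZDist| ≤ 1.64 criterion and the one closest to zero is retained, while the third is rejected outright.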

3 Results and discussion

3.1 Weighting method for the defined attributes

The weight of each attribute determines the impact that attribute has on the delineation of homogeneous regions, in the present case in ULB. Analysis of the influence of climate, statistical, and geographical parameters on ETp was performed using the Shannon entropy method, and the weight of each parameter was obtained. The weights assigned to attributes in each of the categories are summarized in Table 3. Among the attributes, by far the highest weight was given to attributes belonging to the statistical group. These made up 73.98% of the total weight, and thus had a high degree of importance in regionalization of the basin. The climate attributes were the second most important, with 20.55% of the total weight, followed by the geographical attributes, with 5.48%.

Table 3 The weight assigned to each attribute in each of the categories.

Differences between the weights defined for each attribute within groups were observed. The range of weight changes (between the highest and lowest assigned weights) was the greatest in the statistical attribute group, indicating differences in the degree of importance of attributes in this group. The skewness of potential evapotranspiration (CS) had the highest weight (41.6%) and was thus identified as the most influential attribute. The next most important attributes were CV/CS and WS, with 23.98% and 12.83% of the total weight, respectively; i.e., they also contributed strongly to regionalization.

According to the weighting results, the attributes longitude and latitude (x and y), belonging to the geographical group, had the least impact, i.e., had the lowest weight (0.06%). The remaining attributes had equal influence in the reference ETp regionalization of ULB.

3.2 Weighting impact on clustering

The Ward clustering method was used to identify homogeneous regions in ULB. The hierarchical clustering algorithm in the Ward method was used to minimize the internal variance within clusters. As the number of stations in each cluster decreases with increasing similarity, precise estimation of the similarity and the optimal number of clusters is required. Validation of clusters to find the optimal number of clusters was performed using the R software. The model took into account the most frequent number of clusters among the 30 indicators shown in Fig. 2.

Fig. 2 The optimal number of clusters. a No weighting. b By applying weights to attributes

The results showed that attribute weighting did not change the number of clusters, and three main clusters were created in each case for the study area. Silhouette coefficient results were used as a cluster validation index to choose the best set of clusters. The average silhouette width (ASW) lies within the range −1 to +1, and the method with the highest ASW is optimal. In the present case, the values of this coefficient for the non-weighted and weighted clusters were 0.36 and 0.41, respectively. This indicates that attribute weighting was able to cluster the stations in ULB better than no weighting. The ASW value decreased in both cases with an increasing number of clusters, and the best clustering results were obtained at k = 3.

It was found that increasing or decreasing the number of attributes studied did not necessarily lead to a change in the number of clusters. In other words, increasing or decreasing the number of attributes used for regionalization does not by itself increase or decrease the number of homogeneous regions in that area (Dehghan et al. 2018b).

Similarity values, obtained from Euclidean distance, affected the number of the stations in each group in clustering. Figure 3 illustrates the spatial pattern of the three homogeneous regions of ETp identified in ULB. Figure 3a shows the clustering of ETp without applying the weight, while Fig. 3b illustrates the clustering on applying weights to three categories of attributes. On comparing (a) and (b), it can be seen that clustering of the basin changed after the weighting of attributes, and that some stations were located in a different region in ULB. Thus, it can be concluded that geographical proximity is not a guarantee of similarity between stations, which is in agreement with Da Silva et al. (2017).

Fig. 3 Changes in the spatial distribution of ETp with the Ward clustering method in ULB. a Non-weighting. b By applying weight to attributes

3.3 Regionalization with the RoI approach

The threshold values for the three reference stations (Saqqez, Tabriz, and Urmia) were determined according to the similarity distance metric. After determining the final weight of the parameters, the weighted Euclidean distance from the reference station was calculated for all stations. Considering that an increase in the metric distance between stations indicates a decrease in similarity between stations, using weighted attributes can have a positive effect on determining the metric distance between stations to enhance their similarity. The results showed that each of the three reference stations had a different threshold value than other stations, which is quite logical.

Figure 4 shows the position of the stations located in the RoI of the reference stations against the weight of each station. As can be seen, high weights were assigned to stations at different distances, and some stations with different weights were near to each other. Therefore, stations closer to the reference station did not always have higher weights than more distant stations.

Fig. 4 Relationship between the weight of each station (ηij) and its distance (km) from the reference stations: Saqqez (a), Tabriz (b), and Urmia (c) in the RoI method

As can be seen in Fig. 4, the stations with high weight in the RoI were located at a distance of less than 150 km from Tabriz station, about 200 km from Saqqez station, and 100 km or less from Urmia station. In general, within a distance of approximately 0 to 200 km from the reference station, weights of 0.66 to 0.99, 0.75 to 0.99, and 0.51 to 0.97 were allocated to the stations in the RoI of Saqqez, Tabriz, and Urmia stations, respectively. Thus, distance from, or proximity to, the reference station was not the most critical factor affecting the allocated weight. Stations closer to the reference station were mostly assigned higher weights, but some stations at greater distance from the reference station also had high weights. These results of the RoI approach are in agreement with previous findings (Eslamian 2010a, 2010b). Estimation of hydrological parameters at ungauged stations or stations with incomplete data requires more accurate and reliable methods, such as the RoI approach. To our knowledge, this is the first study to estimate ETp with the RoI approach, although it has been used in flood frequency analysis (Burn 1990a, 1990b), flood regionalization (Eng et al. 2007), and precipitation frequency analysis (Gaál et al. 2008a, 2008b; Gaál and Kyselý 2009; Dehghan et al. 2018a, 2018b).

3.4 Allocation of homogeneous regions

The homogeneity index was evaluated using the Monte Carlo simulation with 1000 replications for each of the areas. After calculating L-moments of LCV, LCS, and LCK at each station, the discordancy (Di) and heterogeneity (H) statistics were calculated for stations located in each area.

Stations with a Di value above the critical value in Table 2 were removed from the set of stations used to determine the homogeneous regions. The values of the H-statistic for homogeneous areas are shown in Table 4. In the clustering method, in both cases (before and after weighting), no station was deleted in the first and second clusters, but one and three stations were identified as discordant in the third cluster before and after weighting, respectively, and were excluded from the calculations. As can be seen in Table 4, in the RoI approach, one and two stations were removed from RoI2 and RoI3, respectively, of both Urmia and Saqqez stations. One station from RoI1 and two stations from RoI3 of Tabriz station were removed.

Table 4 Values of the H-statistic of the heterogeneity test in the clustering and RoI method.

After removing discordant stations, a heterogeneity test was conducted for the remaining stations in each region. According to the results of the clustering method in both cases (non-weighted and weighted, based on the H1 measure), cluster 1 and cluster 2 can be considered homogeneous regions. However, in cluster 3, H1 exceeded the critical value of 1, indicating a relatively heterogeneous region. The values of H1 in clusters with attribute weighting were lower than those without weighting, which indicates that the homogeneity of the clusters was increased by attribute weighting.
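The H1 measure discussed above can be sketched as follows, assuming the usual Hosking and Wallis form: V is the record-length-weighted standard deviation of the at-site LCV values, and H1 standardizes the observed V by the mean and standard deviation of V over the simulated homogeneous regions. The Monte Carlo simulation itself is not reproduced here; `v_simulated` stands for the V values from the replications, and both function names are hypothetical.

```python
import numpy as np

def weighted_lcv_dispersion(record_lengths, lcv):
    """Record-length-weighted standard deviation (V) of at-site LCV."""
    n = np.asarray(record_lengths, dtype=float)
    t = np.asarray(lcv, dtype=float)
    t_regional = np.sum(n * t) / np.sum(n)  # regional average LCV
    return float(np.sqrt(np.sum(n * (t - t_regional) ** 2) / np.sum(n)))

def heterogeneity_h1(v_observed, v_simulated):
    """H1 = (V_obs - mean(V_sim)) / sd(V_sim).

    v_simulated: V values from the Monte Carlo replications.
    Using the sample standard deviation (ddof=1) is an assumption.
    """
    v_sim = np.asarray(v_simulated, dtype=float)
    return float((v_observed - v_sim.mean()) / v_sim.std(ddof=1))
```

With the critical value of 1 used here, a region whose H1 is below 1 would be treated as homogeneous.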

In the RoI approach, any increase in the threshold value leads to an increase in the number of stations in the RoI of the reference station. Therefore, the number of stations was the highest in RoI3, which had the highest threshold values. The lowest number of stations (9–10 stations) in the RoI of the three reference stations was observed in option #1. In options #2 and #3, 22–24 and 30 stations, respectively, were considered in the RoI of the reference stations. Increasing the threshold, and thereby the number of stations in the RoI of the reference station, generally led to an increase in heterogeneity in the region. For the Urmia and Saqqez reference stations, the best homogeneity was observed in the area with the option #1 threshold. For Tabriz station, however, the homogeneity was greater with the option #2 threshold than with the option #1 threshold.
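The effect of the threshold on RoI membership can be illustrated with a minimal sketch. It assumes only that a station joins the RoI when its dissimilarity to the reference station does not exceed the threshold; the function name `region_of_influence`, the dissimilarity values, and the three thresholds are all hypothetical and are not taken from this study.

```python
import numpy as np

def region_of_influence(distances, threshold):
    """Indices of candidate stations within the threshold.

    distances: dissimilarity of each candidate station to the
    reference station (the metric is left to the caller).
    """
    d = np.asarray(distances, dtype=float)
    return np.flatnonzero(d <= threshold)

# Hypothetical dissimilarities for five candidate stations
distances = [0.2, 0.9, 0.5, 1.4, 0.1]
for option, threshold in enumerate([0.3, 0.6, 1.0], start=1):
    members = region_of_influence(distances, threshold)
    print(f"option #{option}: {members.size} stations")
```

Because membership is defined by a single inequality, the regions are nested: every station selected under a lower threshold is also selected under a higher one, which is why the station count can only grow as the threshold increases.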

After analyzing the homogeneity of the study regions, the best-fitted distribution for these regions was determined. For this purpose, the ZDist value of each candidate distribution, namely generalized logistic (GLOG), generalized extreme-value (GEV), generalized normal (LOGN), Pearson type III (P-III), and generalized Pareto (GPA), was computed for each area (Table 5). To avoid multiple distribution functions in the estimates obtained in hydrological studies, a single type of distribution function should be used for all study regions. Here, GEV, GLOG, and LOGN were determined as best-fitted distributions by the RoI approach, and GEV and GLOG by clustering in ULB. Hence, GEV was identified as the best distribution and can be used as the selected function in all regions with both of these methods.
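The selection of a single distribution for all regions can be sketched as follows. The sketch assumes the usual Hosking and Wallis goodness-of-fit measure, Z = (τ4 of the fitted distribution − regional average LCK + bias) / σ4, with a fit accepted when |Z| ≤ 1.64. The function names and the Z values in the usage example are hypothetical and are not the values of Table 5.

```python
def z_dist(tau4_dist, t4_regional, bias4, sigma4):
    """Goodness-of-fit measure for one candidate distribution.

    bias4 and sigma4 are the bias and standard deviation of the
    regional average LCK, obtained from simulation.
    """
    return (tau4_dist - t4_regional + bias4) / sigma4

def common_acceptable(z_by_region, limit=1.64):
    """Distributions accepted (|Z| <= limit) in every region.

    z_by_region: {region name: {distribution name: Z value}}.
    """
    names = None
    for z_values in z_by_region.values():
        ok = {name for name, z in z_values.items() if abs(z) <= limit}
        names = ok if names is None else names & ok
    return sorted(names or [])

# Hypothetical Z values for two regions
example = {
    "region A": {"GEV": 0.3, "GLOG": 1.2, "GPA": -3.0},
    "region B": {"GEV": -0.8, "GLOG": 2.1, "GPA": -4.2},
}
print(common_acceptable(example))
```

Intersecting the accepted sets across regions is one simple way to arrive at a single function for the whole study area; ranking the survivors by mean |Z| would be a natural refinement.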

Table 5 Values of the ZDIST statistic of the goodness-of-fit test for considered probability distribution functions.

Root mean square error (RMSE) was used to estimate the error between simulated values and observations. In the clustering method, regions with weighted attributes had lower RMSE than non-weighted regions, and the highest estimated error occurred in the group with the highest number of clusters (Table 5). Based on the results obtained using the RoI approach, option #1 for the threshold showed the best performance (in terms of RMSE). Better results in terms of RMSE were obtained with the RoI method than with clustering. Unlike in the clustering method, in the RoI approach the error values of the groups do not differ widely from one another and vary within a relatively narrow range.
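For reference, the RMSE criterion can be written as a short function. This is a minimal sketch; the function name `rmse` and the numbers in the usage line are illustrative only.

```python
import numpy as np

def rmse(simulated, observed):
    """Root mean square error between simulated and observed values."""
    s = np.asarray(simulated, dtype=float)
    o = np.asarray(observed, dtype=float)
    if s.shape != o.shape:
        raise ValueError("simulated and observed must have the same shape")
    return float(np.sqrt(np.mean((s - o) ** 2)))

# Illustrative values only
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Lower values indicate closer agreement between the estimated quantiles and the observations, which is the sense in which option #1 performed best.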

4 Conclusions

In this study, regionalization of ETp with an integrated spatial pattern based on clustering and on the RoI approach was applied to ULB. Because the selection of attributes is important in regionalization, nine attributes in three groups (statistical, climate, geographical) affecting ETp were studied and weighted using the Shannon entropy method. The results showed that different attributes were allocated different weights, reflecting differences in their degree of importance. The largest weights were assigned to the statistical attributes, among which the skewness coefficient was identified as the most critical attribute. Thus, it can be concluded that outliers should be given special attention, as they increase the skewness coefficient.

The clustering analysis revealed differences between the clusters formed when the attributes of the study area were taken into account and those formed without regard to the attributes. Urmia Lake Basin was divided into three homogeneous regions based on cluster analysis of the study region and homogeneity tests. The optimal number of clusters was identified as the number suggested most frequently among 30 indicators. Average silhouette width (ASW) results indicated better performance of Ward clustering for the model with weighted attributes than for the non-weighted model. Performing a heterogeneity test and removing discordant stations increased the value of H1, and the H1 values obtained with weighted attributes were better than those obtained without weighting. It can be said that attribute weighting improved homogeneity compared with the case in which no weight was assigned to the attributes of ULB.

The highest RMSE values were observed in groups with high H-statistics. Option #1 of the threshold gave the best performance (in terms of RMSE). It can be concluded that weighting of attributes in regionalization has a significant impact on obtaining accurate and reliable quantiles. One of the most important reasons for the superior performance of the RoI approach compared with clustering was the weighting of the stations, which had a significant effect in lowering the error in the estimates. Weighting the stations also reduced the role of nearby stations with low similarity to the target station. Owing to the coordination of the target station with other stations, the RoI method provides more accurate estimates than other regionalization methods and is a highly flexible method for transferring information from nearby stations to target stations.

In general, the results show that the RoI is a powerful approach that rationally involves a large number of stations in the proximity of the reference station, with the weight assigned to each station reflecting its degree of similarity to the reference station. In other regionalization methods, the stations have equal weight and their relative role in the regionalization is not determined. Determining this relative role is one of the strengths of the RoI approach.