1 Introduction

Water demand has increased dramatically with the rapid rise in human population over the last few decades. According to the Food and Agriculture Organization (FAO), the world’s need for water is growing twice as fast as the population (FAO 2015). The existing water resources of a country are under growing pressure from population growth, rapid urbanization, industrial management and irrigated agriculture, which leads to water scarcity and food security issues. By 2025, about 1.8 billion people are expected to experience water scarcity, while two-thirds of the population will experience water stress (United Nations 2014).

Managers and stakeholders in water resources management face complex challenges arising from diverse situations, including climate change, ecosystems, water quality issues, increasing urban water supply demand, and the need to manage watersheds in an integrated, adaptive, and collaborative manner. Arid and semiarid regions (ASARs) are constantly affected by water scarcity and soil moisture deficits. Farmers face major challenges because the uneven spatio-temporal distribution of rainfall and unpredictable rainfall patterns lead to climatic uncertainty and aridity. These problems reduce crop production and increase food risk. Sustainable water resources management in the agriculture sector is therefore vital for addressing food crises and acts as a catalyst for the socioeconomic development of a country. Rainwater harvesting (RWH), practised for thousands of years and given renewed attention since 1980, has proved to be a rational approach to combating water scarcity. It is a promising alternative source of supplementary water for addressing water shortages in ASARs. RWH can be defined as a system for collecting, storing, and conserving rainwater runoff for agriculture in ASARs. RWH has proved extremely valuable, especially in dry areas, where it simultaneously addresses water scarcity, reduces groundwater extraction and cropping risk, and increases crop yield. It also improves pasture growth, boosts reforestation, increases food production, fights soil erosion, and enhances the utilization of water resources. RWH recharges local groundwater reserves; it not only increases water availability but also enhances employment opportunities and improves socio-economic conditions in ASARs (Ammar et al. 2016).

Water harvesting is typically used to resolve water scarcity problems by supplementing available water resources. As defined by the World Overview of Conservation Approaches and Technologies (WOCAT) database, RWH is “the collection and management of rainwater runoff to increase water availability for domestic and agricultural use as well as ecosystem sustenance” (Mekdaschi and Liniger 2013). Identifying potential sites for RWH is an important step toward maximizing water availability and land productivity in semi-arid areas (Isioye et al. 2012). The suitability of RWH sites depends on several criteria (Mahmoud and Alazba 2014), and a number of techniques and methodologies are available for selecting suitable sites (Ahmad 2013; Al-Adamat 2008; de Winnaar et al. 2007). The best site selection must also consider the socio-economic and physical characteristics of the target area (Al-Adamat et al. 2010). Consequently, to enhance water availability through water harvesting, it is extremely important to analyze the main factors that affect decisions on siting the harvesting technique, as well as its best size given the surroundings of the target area.

The identification of suitable sites and the technical design are the two key factors behind the success of RWH systems (Al-Adamat et al. 2012). The selected criteria can be integrated into an RWH site suitability tool by a variety of methods. Two groups of criteria (biophysical and socio-economic) have been used in research studies in ASARs around the world. One group of studies focuses on biophysical criteria such as rainfall, drainage system, slope, land use/land cover, and soil type (Kadam et al. 2012; Kumar et al. 2008), while the other group integrates socio-economic parameters (land tenure, distance to settlements/streams/roads/agricultural areas, population density, related cost) with the biophysical components (FAO 2003; Kahinda et al. 2008; Ziadat et al. 2012; Bulcock and Jewitt 2013; Krois and Schulte 2014).

The methods and tools that have been applied to identify suitable sites using biophysical and socio-economic criteria include GIS and remote sensing (GIS/RS) (Bamne et al. 2014; Al-Shamiri and Ziadat 2012), hydrological modelling (HM) with GIS/RS (Mahmoud and Alazba 2014), multi-criteria analysis (MCA) integrated with HM and GIS/RS (Krois and Schulte 2014; Khan and Khattak 2012; Elewa et al. 2012; Weerasinghe et al. 2011; Sekar and Randhir 2007), and MCA integrated with GIS (Sayl et al. 2017; Kahinda et al. 2009; Jothiprakash and Sathe 2009; Pauw et al. 2008; Ahmed Ould Cherif et al. 2007; Mbilinyi et al. 2007). All methods and tools used in previous studies of RWH site selection have some limitations. GIS/RS is a first-step tool for identifying suitable sites, whereas for more accurate results in data-rich regions the integration of MCA and GIS-based HM is highly recommended. MCA (AHP) combined with GIS offers high potential for RWH site selection in data-poor regions (Ammar et al. 2016).

This study focuses on optimum site selection for RWH in Guatemala, where the majority of the population depends on the agricultural sector, which represents around 10% of GDP. Expert opinions and the physical and socio-economic characteristics of the study area were taken into account, and the optimal sites were identified in a Geographic Information System (GIS) using the Analytical Hierarchy Process (AHP). Because the study watershed covers a large area with limited data, the site selection process can become complex when all these factors are considered. GIS and remote sensing (RS) can ease the process of site selection for RWH (Ziadat et al. 2012; Bulcock and Jewitt 2013). Sayl et al. (2017) used GIS/RS with multi-criteria (MC) decision techniques, covering hydrology, topography, socio-economics and environment, to identify suitable RWH sites in data-scarce areas.

2 Study Area

The study area comprises the cities of Chiquimula, San José la Arada, San Jacinto, and Ipala, with a total geographical area of 770.61 km², located in northeastern Guatemala (Fig. 1). The topographical elevation ranges from about 272 m to 1826 m AMSL (the Ipala volcano). The average, maximum, and minimum temperatures are 25 °C, 31 °C, and 20 °C, respectively, and there is a strong correlation between temperature and topography, with a lapse rate of 9 °C/100 m (MAGA 2005). Rainfall has low spatial variability, ranging from 742 to 1500 mm (average annual normal rainfall = 1166 mm). There are two wet peaks during the year, in July and September, caused by the displacement of the Inter-Tropical Convergence Zone (ITCZ). The study area has a semi-arid climate, and soil moisture deficit is a major problem because of evapotranspiration (1400 mm in high-elevation areas and 2000 mm in low-lying zones), which is closely related to the spatial and seasonal variability of temperature throughout the year (MAGA 2005). Agricultural land covers 23% of the area, with crops including coffee, rice, avocado, tomatoes, cucumbers, and peppers. Other crops cultivated with traditional farming techniques are beans (Phaseolus vulgaris), maize (Zea mays), and sorghum (Sorghum vulgare) in monoculture, along with small amounts of peanuts (Arachis hypogaea). On flat topography, crops are regularly grown under irrigation in the dry season. Agricultural land typically has slopes of 4%–8% and 8%–16%. The physical map of the study area, along with the topographic, land use, slope, soil texture and agriculture type maps, is shown in Fig. 1.

Fig. 1 Physical maps of the study area: slope, land use, soil texture and type of agriculture. Source: Map of Guatemala Republic by MAGA 2005 (modified)

3 Materials and Methods

The Analytical Hierarchy Process (AHP), a multi-criteria analysis method, was selected as the most viable decision method for identifying suitable sites for RWH on a GIS platform. AHP has been widely used in research studies to identify potential RWH sites (Krois and Schulte 2014). In the AHP technique, complex decisions are organized and analyzed in a structured way based on mathematics and expert knowledge. The main principle of AHP is to represent the components of a problem hierarchically so as to display the relationships between levels. The main goal (objective) sits at the uppermost level, and the lower levels consist of more detailed criteria that influence the main objective. In the AHP method, a matrix of pairwise comparisons is generally used to determine the weight of each criterion. For a given objective, the relative importance of any two criteria in the suitability assessment is determined through pairwise comparison. A 9-point continuous scale is used to compare and rate the two criteria. The odd values 1, 3, 5, 7, and 9 correspond, respectively, to equally, moderately, strongly, very strongly, and extremely important criteria when compared with each other, and the even values 2, 4, 6, and 8 are intermediate values.
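As an illustration of the pairwise comparison step, the following minimal Python sketch builds a reciprocal comparison matrix from judgments on the 1–9 scale. The function name `pairwise_matrix`, the criterion labels, and the judgment values are hypothetical and are not taken from this study.

```python
from fractions import Fraction


def pairwise_matrix(names, judgments):
    """Build a reciprocal pairwise comparison matrix.

    `judgments` maps (row, col) name pairs to an integer 1..9 meaning
    "row is that many times as important as col" on the 9-point scale.
    """
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    # Every criterion is equally important when compared with itself.
    matrix = [[Fraction(1) for _ in range(n)] for _ in range(n)]
    for (row, col), value in judgments.items():
        if value not in range(1, 10):
            raise ValueError("judgment must be an integer from 1 to 9")
        i, j = index[row], index[col]
        matrix[i][j] = Fraction(value)
        matrix[j][i] = Fraction(1, value)  # reciprocal entry
    return matrix


# Hypothetical judgments for three criteria (illustrative values only).
example = pairwise_matrix(
    ["runoff", "soil", "land_cover"],
    {("runoff", "soil"): 3, ("runoff", "land_cover"): 5, ("soil", "land_cover"): 3},
)
```

Only the upper triangle of judgments has to be supplied; the diagonal and the reciprocal entries follow from the definition of the matrix.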

The methodology adopted for developing such a system has several steps: (i) Criteria selection; (ii) Data acquisition and classification of criteria; (iii) Development of AHP; (iv) Calculation of relative weights; (v) Sensitivity analysis; (vi) Estimating the amount of water.

3.1 Criteria Selection

The selected criteria describe the factors that define the best site for harvesting rainwater. In the 1990s, several studies focused primarily on biophysical criteria, including drainage network, soil type, slope, rainfall, and land use. After 2000, the trend in suitable site selection for RWH shifted towards integrating socio-economic with biophysical criteria (Ammar et al. 2016). In 2003, FAO used six main criteria (socio-economics, topography, hydrology, agronomy, soils, and climate) for identifying RWH sites (Kahinda et al. 2008). In this study, six factors (slope, land cover, soil texture, distance from roads, distance from agricultural lands, and potential runoff) were selected based on a literature review and the research of different scientists, as shown in Table 1. These biophysical and socio-economic criteria were selected from research studies that adopted multi-criteria analysis integrated with GIS. The selection is based on how commonly the criteria appear in the references and on data availability in this region.

Table 1 Biophysical and socio-economic RWH selection criteria in different references

3.2 Data Acquisition and Classification of Criteria

The spatial and daily meteorological data were collected from the Ministry of Agriculture, Livestock and Food (MAGA) and the Institute of Seismology, Volcanology, Meteorology and Hydrology (INSIVUMEH), as shown in Table 2.

Table 2 Data acquisition

The collected spatial data layers were further processed for the study area using the ArcGIS spatial analysis tool. These layers were then integrated to identify suitable sites for RWH under the selected criteria. First, the suitability levels of the six parameters were classified into five suitability groups and assigned a scale of 1–9 according to Saaty (1990), as shown in Table 3. The suitability maps, suitability levels, and percentage area coverage of each group were determined using the spatial analysis tool.

Table 3 Suitability levels of selected parameters for identifying potential RWH sites

The classification of the selected criteria is also based on the literature review and the research studies of different scientists. For example, slope “A” was classified according to Isioye et al. (2012) and Al-Adamat et al. (2010). The land cover parameter was classified on the basis that harvested water is used for agricultural purposes, following the manual on runoff harvesting (Prinz and Malik 2006). Soil texture was classified according to Critchley et al. (1991). Distance from agricultural lands was classified following Al-Adamat et al. (2010). Distance from roads was classified following Al-Adamat et al. (2010, 2012). Potential runoff was computed and classified from the CN map and slope “B” using the technique of de Winnaar et al. (2007). The CN map was generated from the land use map and soil texture. CN and slope “B” are related to runoff generation: high CN values and the steepest slopes have the highest weight of importance and produce a high proportion of surface runoff.

3.3 Development of Decision Hierarchy Structure

Two decision hierarchy structures were developed to locate the best places for a rainwater harvesting system. Structure 1 analyzes slope as a socio-economic factor related to construction cost, meaning that a steep slope generally has a higher construction cost than a moderate slope. Structure 2 follows the same methodology, but potential places for harvesting rainwater are compared with slope “A”, the gentle slope: first, the suitable places are located without considering slope, and these places are then compared with slope “A”. The main difference between the structures lies in the computation process. Structure 1 considers the sub-criteria first in the computation and eliminates steep terrain areas by considering slope as a major criterion, while Structure 2 applies only one computation in the sub-criteria. Detailed descriptions are given below.

3.3.1 Structure #1

Structure 1 comprises three levels (Fig. 2). The first level is the goal, which is the suitable place for harvesting rainwater. The second level compares the major decision criteria, i.e. the physical and socio-economic features. The third level compares the sub-criteria of the major decision criteria. Slope “A”, distance from agricultural land and distance from roads are the sub-criteria of the socio-economic features, while the physical features include potential runoff, soil texture, and land cover. All sub-criteria have five attribute classes, called suitability groups. The hierarchical process estimates the suitable place according to the AHP equation adapted for Structure 1:

$$ optimal\ index={\sum}_{a=1}^{N_2}\left[{RW}_a\times \left({\sum}_{b=1}^{Na_3}{RW}_b\times {RW}_c\right)\right] $$
(1)

where:

RW_a: relative weight of level 2 “Major decision criterion”.

RW_b: relative weight of level 3 “Sub criteria”.

RW_c: relative weight of attributes.

N_2: total number of level 2 “Major decision criteria”.

Na_3: total number of level 3 “Sub criteria” that belong to level 2 “Major criterion”.
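To make the nested sums in Eq. (1) concrete, the sketch below evaluates the optimal index for a single pixel in Python. All relative weights, the names of the intermediate suitability classes, and the identifiers `ATTRIBUTE_RW`, `STRUCTURE_1` and `optimal_index` are hypothetical placeholders chosen for illustration; they are not the values in the study's tables.

```python
# Hypothetical relative weights; none of these values come from the study's tables.
ATTRIBUTE_RW = {  # RW_c: one weight per suitability group (class names assumed)
    "optimally_suitable": 0.50,
    "highly_suitable": 0.25,
    "moderately_suitable": 0.13,
    "marginally_suitable": 0.08,
    "not_suitable": 0.04,
}

STRUCTURE_1 = {  # level 2 -> (RW_a, {level 3 sub-criterion: RW_b})
    "physical": (0.75, {"potential_runoff": 0.6, "soil_texture": 0.3, "land_cover": 0.1}),
    "socio_economic": (0.25, {"slope_a": 0.6, "dist_agriculture": 0.3, "dist_roads": 0.1}),
}


def optimal_index(structure, pixel_classes, attribute_rw=ATTRIBUTE_RW):
    """Evaluate Eq. (1) for one pixel: sum_a RW_a * (sum_b RW_b * RW_c)."""
    total = 0.0
    for rw_a, sub_criteria in structure.values():
        # Inner sum over the sub-criteria that belong to this major criterion.
        inner = sum(
            rw_b * attribute_rw[pixel_classes[name]]
            for name, rw_b in sub_criteria.items()
        )
        total += rw_a * inner
    return total


# Suitability class of one hypothetical pixel for each sub-criterion.
pixel = {
    "potential_runoff": "optimally_suitable",
    "soil_texture": "highly_suitable",
    "land_cover": "not_suitable",
    "slope_a": "optimally_suitable",
    "dist_agriculture": "moderately_suitable",
    "dist_roads": "highly_suitable",
}
index_value = optimal_index(STRUCTURE_1, pixel)
```

In a GIS workflow the same calculation would be applied to every pixel of the classified raster layers, which is what a weighted overlay does.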

Fig. 2 Schematic of hierarchy structure #1

3.3.2 Structure #2

Structure 2 has four levels, and the difference between the two structures lies in the treatment of slope (Fig. 3). In Structure 2, slope “A” is omitted from the socio-economic features, so distance from agricultural land becomes the most important socio-economic factor. Slope “A” is instead a major decision criterion that is compared with the potential sites. A potential site results from the computation of the physical and socio-economic features without slope “A”. The objective in Structure 2 is to find potential sites while omitting slope type; after these locations are identified, slope “A” must be satisfied to find the optimal locations, which is why potential sites and slope “A” are given the same degree of importance.

$$ optimal\ index={\sum}_{a=1}^{N_2}\left\{{RW}_a\times \left[{\sum}_{b=1}^{Na_3}{RW}_b\times \left({\sum}_{c=1}^{Nb_4}{RW}_c\times {RW}_d\right)\right]\right\} $$
(2)

where:

RW_a: relative weight of level 2 “Major decision criteria”.

RW_b: relative weight of level 3 “Potential Site”.

RW_c: relative weight of level 4 “Sub criteria”.

RW_d: relative weight of attributes.

N_2: total number of level 2 “Major decision criteria”.

Na_3: total number of level 3 “Potential Site” elements that belong to level 2 “Major criteria”.

Nb_4: total number of level 4 “Sub criteria” that belong to level 3 “Potential Site”.
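The extra level in Eq. (2) can be handled by evaluating the hierarchy recursively. The following Python sketch is one possible way to do this under stated assumptions: the weights are hypothetical, slope “A” is assumed to be scored directly by its attribute weight, and the names `tree_index`, `STRUCTURE_2` and `ATTRS` are illustrative only.

```python
def tree_index(node, pixel_classes, attribute_rw):
    """Evaluate a nested weighted hierarchy for one pixel.

    A node is either a criterion name (leaf) or a list of
    (relative_weight, child_node) pairs.
    """
    if isinstance(node, str):
        # Leaf: look up the attribute weight of the pixel's suitability class.
        return attribute_rw[pixel_classes[node]]
    return sum(rw * tree_index(child, pixel_classes, attribute_rw) for rw, child in node)


# Hypothetical attribute weights for two of the suitability groups.
ATTRS = {"optimally_suitable": 0.50, "not_suitable": 0.04}

# Hypothetical weights. Level 2 gives potential site and slope "A" equal importance.
STRUCTURE_2 = [
    (0.5, [  # level 2: potential site
        # level 3: physical features, with level 4 sub-criteria
        (0.5, [(0.6, "potential_runoff"), (0.3, "soil_texture"), (0.1, "land_cover")]),
        # level 3: socio-economic features without slope "A"
        (0.5, [(0.75, "dist_agriculture"), (0.25, "dist_roads")]),
    ]),
    (0.5, "slope_a"),  # level 2: slope "A" (assumed to be scored directly)
]

criteria = ["potential_runoff", "soil_texture", "land_cover",
            "dist_agriculture", "dist_roads", "slope_a"]
flat_site = {name: "optimally_suitable" for name in criteria}
steep_site = dict(flat_site, slope_a="not_suitable")

flat_index = tree_index(STRUCTURE_2, flat_site, ATTRS)
steep_index = tree_index(STRUCTURE_2, steep_site, ATTRS)
```

With equal weights at level 2, an unsuitable slope pulls the index down strongly even when every other criterion is optimal, which reflects the role slope “A” plays in this structure.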

Fig. 3 Schematic of hierarchy structure #2

3.4 Calculation of Relative Weights for Both Structures

Each level and its attribute classes were compared with each other to develop relative weights (RW) for all elements in the two hierarchy structures. For this purpose, each level and attribute class was assigned a weight value (score), as discussed below.

The suitability groups, i.e. the attribute classes, were assigned a weight of 1–9. In the third level, potential runoff, soil texture, and land use have weight values of 5, 3 and 1, respectively, for the physical features. These values were assigned because humans can distinguish between two similar criteria. Potential runoff is the principal element of RWH and a prime regulator of the hydrological response of a catchment, followed by soil texture and land use (de Winnaar et al. 2007). For the socio-economic features, slope “A”, distance from agricultural lands, and distance from roads have weight values of 5, 3 and 1, respectively. In several studies, slope “A” was considered more important than distances from agricultural lands and roads (Isioye et al. 2012; Al-Adamat et al. 2010; Kahinda et al. 2008; Mbilinyi et al. 2007). In the second level, the physical and socio-economic features were weighted for three cases: in the first, physical features are 3 times more important than socio-economic features; in the second, socio-economic features are 3 times more important than physical features; and in the third, both have the same degree of importance.

After the weight values (scores) were assigned to each level and attribute class, the AHP method with a pairwise comparison matrix was applied to determine the weight of each criterion. The relative weight of each criterion was calculated using the principal eigenvector procedure to produce a best-fit set of weights, as described below.

The relative importance of each criterion in determining suitability for the stated objective is calculated from the principal eigenvector of the square reciprocal matrix of pairwise comparisons between criteria. This investigation used an approximation of the eigenvector, obtained by calculating the weights within each column and then averaging over all columns (Eastman et al. 1995). To do so, the matrix is first normalized so that the sum of each column equals 1, and the normalized values are then averaged across the columns to give one weight per criterion. For example, the relative weights for attributes are given in Table 4.

Table 4 Relative weight for attributes
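The column-normalization approximation of the principal eigenvector described above can be sketched in a few lines of Python. The 3 × 3 matrix below is hypothetical and serves only to show the arithmetic; it is not the comparison matrix used in this study, and `ahp_weights` is an illustrative name.

```python
def ahp_weights(matrix):
    """Approximate the principal eigenvector of a pairwise comparison matrix."""
    n = len(matrix)
    # Normalize each column so that it sums to 1.
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    # Average the normalized values across columns: one weight per criterion.
    return [sum(row) / n for row in normalized]


# Hypothetical 3 x 3 reciprocal matrix (illustrative judgments only).
comparisons = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights = ahp_weights(comparisons)

# Two elements, one judged 3 times more important than the other.
case_weights = ahp_weights([[1.0, 3.0], [1 / 3, 1.0]])
```

For a two-element comparison in which one element is 3 times more important than the other, this procedure yields weights of 0.75 and 0.25. For a perfectly consistent matrix the approximation coincides with the exact eigenvector; otherwise it is only close to it.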

After every matrix of the model has been obtained, the next step is to compute the weighted overlay with the spatial analysis tool in ArcGIS 10.1 and to calculate the optimal index of preference using Eqs. (1) and (2).

3.5 Sensitivity Analysis

The sensitivity analysis was performed to detect common sites for harvesting rainwater while varying the weight of importance of the socioeconomic and physical features of the target area. Performing a sensitivity analysis with different weights and for different structures gives a clear indication of the robustness of the model. In Case I, the physical features were more important than the socio-economic features and were given a value 3 times larger to emphasize their importance. In Case II, the socio-economic features were more important than the physical features and were given a value 3 times larger. In Case III, the physical and socio-economic features have the same importance and were given the same values. The relative weights for each case were calculated using the eigenvector procedure, and an error matrix was developed to assess the robustness of both structures for each case. This provides a way to check how the models present their results. The application of the error matrix is shown in Fig. 4; it is useful for analyzing the assignment of classes in an image and detects differences by comparing data. Four error indicators were used: producer’s accuracy, user’s accuracy, overall accuracy, and kappa’s accuracy.

Fig. 4 Application of error matrix
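The four error indicators can be computed from an error matrix as in the Python sketch below. It uses the standard definitions of overall, producer’s and user’s accuracy and of the kappa coefficient. The layout (rows for one map, columns for the other), the per-class output, and the pixel counts are assumptions for illustration; the study reports a single value per indicator, and how those values were aggregated is not stated here.

```python
def error_matrix_indicators(matrix):
    """Accuracy indicators from a square error (confusion) matrix.

    Assumed layout: rows are classes from one map, columns are classes
    from the other map, so the diagonal counts pixels with matching classes.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = [matrix[k][k] for k in range(n)]

    overall = sum(diagonal) / total
    # Per-class accuracies; None marks a class with no pixels.
    producers = [d / c if c else None for d, c in zip(diagonal, col_totals)]
    users = [d / r if r else None for d, r in zip(diagonal, row_totals)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (overall - expected) / (1 - expected)
    return {"overall": overall, "producers": producers, "users": users, "kappa": kappa}


# Hypothetical 2-class pixel counts (not data from this study).
indicators = error_matrix_indicators([[50, 10], [5, 35]])
```

Kappa discounts the agreement that would be expected by chance, which is why it is lower than the overall accuracy for the same matrix. The function does not handle the degenerate case in which all pixels fall into a single class.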

3.6 Estimating the Amount of Water

After suitable locations have been identified, the next step is to estimate the amount of water that can be harvested. The hydrological SCS Curve Number (CN) method was used to estimate the runoff; it requires precipitation and CN (Al-Jabari et al. 2009). In this research, 22 years (1990–2011) of historical precipitation data from two meteorological stations, “Camotán” and “Esquipulas”, were used. To analyze the amount of rainfall between these two stations, an average of the two stations was computed using the isohyetal method. The runoff was then calculated using the equation of the SCS CN method.
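For reference, the standard metric form of the SCS CN runoff equation can be sketched as follows. The initial abstraction ratio of 0.2 is the conventional default and is an assumption here, as are the function names and the 50 mm rainfall depth used in the example; the source does not state which parameter values were applied.

```python
def scs_runoff_mm(precip_mm, cn, ia_ratio=0.2):
    """Runoff depth Q (mm) from rainfall depth P (mm) with the SCS CN method.

    S = 25400 / CN - 254   (potential maximum retention, mm)
    Ia = ia_ratio * S      (initial abstraction, mm; 0.2 is the usual default)
    Q = (P - Ia)^2 / (P - Ia + S) if P > Ia, otherwise 0
    """
    if not 0 < cn <= 100:
        raise ValueError("CN must be in the interval (0, 100]")
    s = 25400.0 / cn - 254.0
    ia = ia_ratio * s
    if precip_mm <= ia:
        return 0.0  # all rainfall is abstracted before runoff starts
    return (precip_mm - ia) ** 2 / (precip_mm - ia + s)


def runoff_volume_m3(runoff_mm, area_m2):
    """Convert a runoff depth over a contributing area into a volume."""
    return runoff_mm / 1000.0 * area_m2


# Hypothetical 50 mm rainfall depth with a CN of 78.14.
q_mm = scs_runoff_mm(50.0, 78.14)
```

Because the retention term S grows quickly as CN decreases, the same rainfall depth produces much less runoff under dry antecedent moisture conditions than under wet ones.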

4 Results and Discussion

4.1 Case I

Overlaying the result maps of Structure 1 and Structure 2 for Case I (Fig. 5a) shows that the 11 points allocated to the optimally suitable class by Structure 2 were repeated in Structure 1. When the highly suitable class was analyzed as the second-best option for RWH, the number of pixels in common was very low. For Structure 1, the highly suitable class represents 8% of all pixels, whereas for Structure 2 it represents only 4%, where 1 pixel represents 500 m². In the overlay, only 0.5% of pixels were highly suitable in both structures.

Fig. 5 Rainwater harvesting suitability maps for Case I, Case II, Case III and final suitability map

To identify the difference between the two structures, an error matrix was generated. The comparative results are expressed as overall, producer’s, user’s, and kappa’s accuracy (Table 5). The producer’s accuracy gives the proportion of pixels correctly assigned for Structure 1, which in this case was roughly 51%. The user’s accuracy gives the proportion of pixels correctly assigned for Structure 2, which in this case was about 74%. Kappa’s accuracy measures the agreement between Structure 1 and Structure 2 and was about 48%, indicating moderate agreement. Similarities between the two structures exist, for example in the not suitable class, for which the main indicators of the error matrix show that the percentage of pixels assigned to this class was high in both Structure 1 and Structure 2.

Table 5 Main indicators of error matrix for Case I, Case II and Case III

4.2 Case II

The outcome of overlaying Structure 1 and Structure 2 for Case II (Fig. 5b) shows 46 pixels were shared, representing 1.5% of all pixels. Furthermore, the highly suitable class showed 64 pixels were common, representing 2.1% of total pixels.

All accuracy indicators were very low, which clearly shows the difference between the two structures, owing to the few pixels in common for each class (Table 5). The producer’s and user’s accuracies, i.e. the proportions of pixels correctly assigned for Structures 1 and 2, were about 42% and 42.6%, respectively. Kappa’s accuracy, the agreement measure between Structure 1 and Structure 2, was 30%, showing fair agreement. The similarities between the two structures in their allocation of the five classes were very low.

4.3 Case III

Overlaying Structure 1 and Structure 2 for Case III (Fig. 5c) shows that, for the optimally suitable class, both structures have 6 pixels in common, representing only 0.2% of total pixels. For the highly suitable class, 64 pixels were common, representing 2%.

The difference between the two structures was assessed using an error matrix. The producer’s, user’s and kappa’s accuracies were 47%, 48%, and 42%, respectively. Kappa’s accuracy indicates moderate agreement. Table 5 shows the percentage of pixels assigned to the same class by Structure 1 and Structure 2 for the five classes. The not suitable class is the group in which both structures are very similar in their allocation of pixels.

Based on the main indicators of the error matrix, Case I showed the most similarity between the two structures, while Case II showed very low similarity.

4.4 Overlay of Case I, II, III

After AHP was implemented for each case and structure, the next step was to overlay the final maps of Case I, Case II, and Case III to identify the optimal places across all approaches in a final map (Fig. 5d). The final overlay gave a robust result because these locations were optimal regardless of the case or structure type. Only 4 pixels were found to be optimally suitable, representing 0.1% of the total area, and 9 pixels were highly suitable, representing 0.3% (Fig. 5d).
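The overlay step amounts to keeping only the pixels that receive the same suitability class in every map. A minimal Python sketch of that idea is given below, assuming each map is a small grid of integer class codes with 5 standing for optimally suitable and 4 for highly suitable. The grids, the class codes and the name `common_pixels` are hypothetical; the study itself performed the overlay in ArcGIS.

```python
def common_pixels(class_maps, target_class):
    """Return the (row, col) cells assigned `target_class` in every map."""
    first = class_maps[0]
    cells = set()
    for r, row in enumerate(first):
        for c, _ in enumerate(row):
            # Keep the cell only if all maps agree on the target class.
            if all(m[r][c] == target_class for m in class_maps):
                cells.add((r, c))
    return cells


# Three hypothetical 2 x 3 class maps, one per case (codes 1..5 assumed).
case_maps = [
    [[5, 4, 1], [5, 2, 4]],
    [[5, 4, 1], [3, 2, 4]],
    [[5, 1, 1], [5, 2, 4]],
]
optimal_everywhere = common_pixels(case_maps, 5)
highly_everywhere = common_pixels(case_maps, 4)
```

Requiring agreement across all maps is a strict criterion, which is consistent with the small number of pixels that survive the final overlay.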

4.5 Amount of Runoff

The amount of runoff varies with watershed soil moisture conditions and depends on CN and precipitation. In the study area, for normal (AMC II), dry (AMC I) and wet conditions, the CN and precipitation exceedance were 78.14, 61.32 and 89.6 mm, respectively. The amount of runoff helps in estimating the proportion of runoff obtained from the recommended sites and in finding the best location for RWH systems. For example, in this research 1 pixel has a size of 500 m² and there are 4 optimally suitable pixels in total (0.1% of the total area), so the RWH systems should be located inside those pixels, while the actual project size for RWH systems should consider the amount of water required, depending on the needs of agriculture.

5 Conclusions

The critical step in harvesting rainwater for agricultural purposes is to determine which locations are best for maximizing the amount of water harvested and minimizing the ecological impact. Many methodologies exist; some are inadequate for a specific place because of regional conditions, and others because of political and social issues. Identifying optimally suitable places for harvesting rainwater is an important step in promoting runoff harvesting as a potential resource for agriculture. The AHP methodology provides a suitable and straightforward tool for selecting the optimal locations for harvesting rainwater in the study area. The selection of factors was based on suggestions drawn from a review of several previous investigations. The AHP method is flexible with respect to changes in structure type, threshold values, and relative weights of decision criteria, and it combines GIS techniques with runoff approaches. The relative weights of the criteria were based on subjective expert preferences in the literature, and each has its own justification for the assigned value. The sensitivity analysis, in which the relative weights of socio-economic and physical features were varied, helped to find common sites from different perspectives.

The novelty of this paper lies in overlaying the result maps of Structure 1 and Structure 2 to construct suitability maps for each of the three cases. The three cases, which are based on different weights assigned to socioeconomic and physical features, are then overlaid to identify the most suitable sites for RWH projects. This approach provides a suitable and straightforward tool for selecting the optimal locations for RWH. The results identified four sites as optimally suitable and eight sites as highly suitable. The area of a site is 500 m², so a total area of 6000 m² was chosen in the study area for RWH projects. A total volume of 424,070.81 m³ of water can potentially be harvested from these optimally and highly suitable sites. As the volume of an Olympic pool is about 2500 m³, the study area has the potential to harvest rainwater equivalent to about 169 Olympic pools. The concept used in this case study is beneficial in terms of saving rainwater, and the saved runoff can be used to maximize the agricultural productivity of the study region.
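The totals quoted above can be cross-checked with simple arithmetic, using only the values stated in this section (twelve sites of 500 m² each, a harvestable volume of 424,070.81 m³, and a pool volume of about 2500 m³):

$$ (4+8)\times 500\ \mathrm{m^2}=6000\ \mathrm{m^2},\qquad \frac{424{,}070.81\ \mathrm{m^3}}{2500\ \mathrm{m^3}}\approx 169.6 $$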