1 Introduction

The issue of insufficient water supply has emerged as a significant concern in recent times, particularly in semi-arid regions (Yannopoulos et al. 2019). The combination of rapid urbanization, expanding industries, and the impacts of climate change is placing substantial strain on water resources. This emphasizes the pressing need for effective water resource management. Rainwater harvesting (RWH) is recognized worldwide as an economically and ecologically favorable system, providing reliable and accessible water supplies (Al-Abadi et al. 2017). Global climate change presents an unprecedented challenge, as it disrupts annual precipitation patterns and contributes to escalating temperatures, exacerbating the overarching issue of global warming (Pachauri et al. 2014). Regions such as the Middle East and less developed nations are particularly vulnerable due to shifting precipitation patterns, rising temperatures, and prolonged droughts. These changes are unmistakable and emphasize the urgency of addressing climate change (Shahid 2012).

The Intergovernmental Panel on Climate Change (IPCC) and global climate models have raised alarming concerns, predicting a staggering increase in the mean temperature of the atmosphere by the end of the twenty-first century, with estimates ranging from 2.6 to 4.8 °C (IPCC 2013). This alarming trend is expected to result in reduced rainfall and increased evaporation rates, potentially leading to a substantial decrease in surface runoff, with anticipated deficits of 10–30% in semi-arid regions (Hassan 2020).

Given these substantial challenges, rainwater harvesting (RWH) has gained prominence, particularly in water-constrained regions like semi-arid areas. RWH serves as a powerful strategy to enhance water accessibility and availability, addressing both domestic and agricultural needs (Abdullah et al. 2021). The Middle East stands out as one of the regions most severely affected by global climate change, primarily due to rising temperatures and declining precipitation. Iraq, in particular, is highly vulnerable to climate change and global warming, grappling with multiple environmental challenges, including dwindling water supplies from lakes and rivers, a growing population, and escalating temperatures that further stress water resources and the local ecosystem (Hassan and Nile 2021).

RWH site selection has been the subject of numerous research studies and approaches worldwide. The utilization of remote sensing (RS) and geographic information systems (GIS) has emerged as an effective and accurate means of identifying optimal locations for RWH systems (Buraihi and Shariff 2015). RWH is recognized as a sustainable design technique for efficient water resource management, and many nations, including those in the Middle East, Southeast Asia, and Africa, have adopted comprehensive plans and invested in technology to collect, store, and utilize rainfall for domestic and agricultural purposes (Ibrahim et al. 2019). The evaluation, management, and planning of rainwater harvesting have gained significance, particularly in regions with limited precipitation (Dhakate et al. 2013). Rainfall is the central and critical variable in rainwater harvesting systems (RWHS), with its temporal variability significantly influencing their efficiency (Silva et al. 2015). Numerous researchers have conducted studies on suitable site selection for RWH, examining regions such as Dohuk Governorate, Iraq (Ibrahim et al. 2019), semi-arid areas in India (Garg et al. 2022), the West Bank of Palestine (Shadeed et al. 2020), Pakistan (Khan et al. 2022), and Diyala province, Iraq (Nabit et al. 2023). These studies predominantly rely on historical climate data from the study locations for the analysis of suitable site selection for RWH. For instance, Saha et al. (2021) utilized RS and GIS to identify potential RWH locations in West Bengal, India. Similarly, Shadeed et al. (2020) employed the multi-criteria decision analysis (MCDA) approach in combination with GIS and RS techniques to assess the suitability of sites for RWH in Palestine, specifically the West Bank. Aghaloo and Chiu (2020) utilized GIS, RS, and weighted fuzzy layers to identify suitable RWH sites in Iran, a semi-arid region. They concluded that the integration of the multi-criteria method with GIS and RS proved highly effective in water resources management, saving time on a long-term scale. Moreover, Mitab et al. (2023) utilized MCDA and analytical hierarchy process (AHP) methods to select suitable landfill sites in Kirkuk City. Mohammed et al. (2023) also discussed MCDA in a review of groundwater recharge methods.

Although various studies on rainwater harvesting (RWH) have been conducted, few have examined the influence of a changing climate on RWH, especially in arid and semi-arid locations. A more comprehensive evaluation and feasibility study of water resource management in these areas is required. Climate change poses a significant threat to global food and nutrition security and is expected to exacerbate these challenges (Ali et al. 2020). More in-depth studies on recent climate scenarios are needed, employing acceptable temporal and spatial resolutions, in order to precisely quantify the impacts of climate change (Mubeen et al. 2020). The most recent climate scenarios provided by the Coupled Model Intercomparison Project (CMIP) include the fifth phase (CMIP5) with a spatial resolution of 200 km (Ali et al. 2019) and the sixth phase (CMIP6) with a higher spatial resolution of 100 km (Kheir et al. 2023). CMIP has found extensive application in various global studies, offering valuable insights into a multitude of critical areas. These include investigations into monsoon pattern alterations, assessments of offshore wind energy resources, projections of crop yield, and the simulation of precipitation patterns (Shiru et al. 2022). Additionally, CMIP has contributed to our understanding of mean sea-level changes, sea surface temperature variations, and land surface air temperature simulations, providing valuable insights into recent warming trends (Carbalho et al. 2022). Furthermore, CMIP's reach extends to regional and continental analyses. It has been pivotal in simulating climatological temperature and precipitation for regions like Southeast Asia, characterizing future drought patterns across South Asia, modeling extreme precipitation events in the USA, and estimating evapotranspiration rates across Africa (Nooni et al. 2021).

For instance, Wallace et al. (2015) utilized daily precipitation data that was statistically downscaled from Global Climate Models (GCMs) to estimate and select suitable catchment areas. Similarly, Lo and Koralegedara (2015) incorporated statistically downscaled precipitation data along with historical data from GCMs in their assessment of reliability and suitable site selections for RWH. To promote RWH, creative technological approaches have been employed to tap into the most abundant water resource available. Prioritizing, assessing, and preparing for RWH strategies become especially significant in regions with minimal rainfall (Al-Ghobari and Dewidar 2021). Finding potential RWH zones in semi-arid regions is essential for improving water availability and land productivity. For this, a variety of rainfall-runoff models have been utilized, such as models based on artificial neural networks. Several approaches are frequently used in the field of water resource management, including the Soil Conservation Service Curve Number (SCS-CN) method, the rational method, and the use of the water stability strategy (Muneer et al. 2020).

Geographic information system (GIS) and remote sensing (RS) technologies offer powerful tools for analyzing and aggregating data from multiple criteria in the identification of rainwater harvesting zones. The multi-criteria decision analysis (MCDA) approach, along with GIS and RS, is frequently employed to integrate data from diverse criteria (Krois and Schulte 2014). The analytical hierarchy process (AHP) and weighted linear combination (WLC) are valuable decision-making tools that enable the systematic organization and analysis of composite judgments using both expert knowledge and mathematical principles (Saaty 2008). Preeti et al. (2022) employed GIS and WLC to identify suitable sites for rainwater harvesting structures in Greater Western Sydney, Australia, and the approach proved useful.

However, many of these studies focused primarily on identifying potentially suitable sites for RWH systems and structures, without specifically considering future climate change and its effect on RWH sustainability especially in semi-arid regions that face water poverty and scarcity. As a result, there is a need for further research that integrates GIS, RS, and MCDA techniques to identify suitable RWH locations for sustainable water management, particularly in semi-arid regions of the Middle East.

This study aims to address this research gap by developing and applying an integrated methodology that combines these techniques in the northern part of Iraq, a semi-arid country in the Middle East. The goal is to enhance future water resources management and establish new policies for identifying suitable RWH sites. Government entities and investors can use this approach to promote sustainable water management and to prepare for water scarcity in the near future. The study proposes a novel method for evaluating RWH that accounts for the effects of climate change and forecasts future rainfall patterns under climate change scenarios, with the aim of reducing water scarcity risks in both the near and far future. The approach combines remote sensing (RS) and Geographic Information System (GIS) technology with LARS-WG to identify suitable RWH sites under climate change scenarios and to forecast future precipitation patterns in the study region. The results are intended to support decisions on future water resources management, help managers and engineers in the region improve water security and sustainability, and inform new water policies under rapid climate change.

2 Study area and methodology

2.1 Overview of the study area

This study was conducted in Kirkuk province, which is located in the northern region of Iraq and bounded to the northeast by the Zagros Mountains and to the west by the Lower Zab and Tigris Rivers, which are recognized for their mild flow (Abdulrahman 2018). Kirkuk is roughly 250 km (155 miles) from Baghdad, Iraq's capital and largest city. The study area lies between latitudes 35° 28′ N and 35° 47′ N and longitudes 44° 24′ E and 44° 40′ E, at an elevation of 350 m, as shown in Fig. 1. The province covers approximately 9679 square kilometers, or 2.2% of Iraq's territory. Despite its modest size, the province is important agriculturally, and its average annual temperature is 28 °C (Al-Abadi et al. 2016).

Fig. 1 Study area of Kirkuk

The northeastern highlands of Iraq commence from the southern province of Kirkuk, extending to the borders of Turkey and Iran. Additionally, Kirkuk comprises three distinct districts: Daquq, Al-Hawiga, and Dibis. According to estimates from the Iraqi government, Kirkuk's population in 2007 was 902,019 people, or about 3% of Iraq's population. Kirkuk's climate can be described as warm and semi-arid, with cold, damp winters and very hot, dry summers. The annual average precipitation is 360 mm, with a rainy season lasting from October until the end of April (Buraihi and Shariff 2015).

2.2 Dataset collection and generation

The climate data used for this study consisted of daily rainfall records obtained from the Iraqi Meteorological Organization and Seismology Department, spanning the years 1980 to 2022. These records were provided in the form of monthly metadata. The rainfall dataset was compiled from several sources, including the Iraqi Meteorological Department and the Ministry of Agriculture, and was subsequently cross-referenced with NASA meteorological records for the same 42-year period from 1980 to 2022. In addition, the digital elevation model for the year 2023 was obtained from the official website of the United States Geological Survey (USGS) as a raster file, which was used to generate both the slope model and the surface drainage data for the study area. This processing was carried out using Arc Hydro tools within ArcMap 10.8. Furthermore, the hydrologic soil groups (HSGs) were obtained from the official website of the United States Department of Agriculture as a TIF file for the year 2009, which was then cross-referenced with an earlier soil map from the Iraqi Ministry of Agriculture (USDA 2009). LARS-WG 7.0 was used to plot the historical mean monthly precipitation (mm) for the study area, which is presented in Fig. 2. Additionally, Fig. 3 shows the geographical distribution of the selected rainfall stations within the study zone, and Table 1 summarizes the key details and data for these stations.

Fig. 2 The historical precipitation of Kirkuk

Fig. 3 Rainfall stations located in Kirkuk

Table 1 Data of the selected rainfall stations

3 Methodology

The primary objective of this study is to pinpoint optimal locations for rainwater harvesting (RWH) using the Analytical Hierarchy Process (AHP) methodology, in which a pairwise comparison matrix assigns a weight to each criterion. To identify prospective RWH sites, we rely on six pivotal criteria, as per the recommendations from the Food and Agriculture Organization (FAO). These criteria encompass surface runoff potential, topography (slope), land cover/use, drainage density, rainfall data, soil texture, and hydrological soil groups (HSGs). To assemble and examine the required data and to delineate the watershed and stream order for Kirkuk Governorate, a digital elevation model (DEM) was obtained from the United States Geological Survey (USGS), specifically the Shuttle Radar Topography Mission (SRTM) data, with a resolution of 30 m, as a TIF file. Furthermore, we used high-resolution land cover information, retrieved from the latest available imagery on the ESRI website (https://livingatlas.arcgis.com/landcover/), with a resolution of 10 m.

The data used in this study was acquired through remote sensing and processed with the aid of ArcGIS software. Subsequently, it was reclassified to align with the specific research area within Kirkuk Governorate. The rainfall dataset utilized spans a substantial time frame, encompassing 42 years (1980–2022). Furthermore, for a comprehensive understanding of climate change impacts in the study area, we employed the LARS weather generator model (LARS WG 7.0) to generate and analyze future climate projections for rainfall data over a century. Figure 4 provides an overview of the methodological framework employed for the selection of suitable sites under the context of climate change in Kirkuk Governorate.

Fig. 4 Adopted methodology framework

3.1 Data input and processing

3.1.1 Digital elevation model (DEM) and slope map

A digital elevation model (DEM) is a gridded representation of the Earth's surface topography. DEMs are generated from regularly spaced elevation data extracted from aerial photography and geographic maps, as outlined by Gavit et al. (2018). In our study, we used elevation data provided by the NASA Shuttle Radar Topography Mission (SRTM), which has a spatial resolution of 30 m, as a raster file. This data was used to create the DEM. The DEM provides vital information on various topographic features, including slope, flow direction, flow accumulation, hill shade, and stream order within the basin. These data were processed and analyzed using algorithmic techniques within GIS software. Notably, the slope plays a crucial role in influencing sedimentation levels, water flow velocity, and the materials needed for constructing dams, as emphasized by Adham et al. (2016). Guidelines for RWH sites suggest that RWH areas should ideally feature slopes of no more than 5%. This limit arises from the potential for high erosion rates in areas with steeper slopes, resulting from unpredictable runoff distribution, and from the need for extensive earthworks (Critchley et al. 2013). To support our analysis, we developed a slope model from the 30-m resolution DEM obtained from the United States Geological Survey (USGS) website. Spatial Analyst tools within ArcGIS 10.8.1 software were used to construct the slope map model for the study area. The slope values were then reclassified into five distinct categories to create the final model. Figures 5 and 6 illustrate the DEM and slope models for the study area, respectively.

Fig. 5 DEM for the study area

Fig. 6 Slope of the study area

3.1.2 Soil types and texture

The rate of infiltration and the volume of surface runoff are closely tied to the composition of soil, specifically its texture. Soil texture is determined by the proportions of sand, silt, and clay, which together place the soil in a textural class. For rainwater harvesting (RWH) systems, fine- and medium-textured soils are typically recommended due to their higher water retention capacity. RWH is particularly well-suited to soils that possess a greater capacity for water storage (Sayl et al. 2022). Clay soils, with their low permeability and high water-holding capacity, are especially favorable for water storage. When planning RWH systems, the choice of location is heavily influenced by the soil's texture, especially when the aim is to store water for the benefit of people, animals, and crops (Glendenning et al. 2012). The United States Department of Agriculture (USDA) provides hydrologic soil groups (HSGs), which categorize soils into four hydrologic groups based on their ability to facilitate infiltration and water transfer through the soil (USDA 2009). Table 2 lists the hydrologic soil groups (HSGs) for the study area. We reclassified these hydrologic soil groups based on the specific soil texture characteristics of the study region. Figures 7 and 8 show the soil textures and their corresponding HSGs within the study region.

Table 2 The hydrologic soil groups

Fig. 7 Soil texture of the study region

Fig. 8 HSGs in the study region

The land cover within a specific region plays a significant role in influencing the runoff generated by raindrops. Vegetation with higher density exhibits superior interception and infiltration rates, ultimately resulting in reduced runoff (Sameer et al. 2023).

To better understand these dynamics, we applied a maximum-likelihood classification, which relies on the means, variances, and covariances of the class signatures, to a land cover raster for the year 2021. This analysis was conducted at a spatial resolution of 10 m. The land cover was categorized into eight distinct classes: water, trees, grass, flooded vegetation, agricultural land, scrub/shrub, built-up areas, and bare terrain. The reclassification of these categories produced the final land cover model. The land cover and land use within the study region are shown in Fig. 9.

Fig. 9 Land cover/use for the study region

3.1.3 Rainfall

The Food and Agriculture Organization (FAO) emphasizes that rainfall is a pivotal factor significantly impacting runoff, particularly in arid and semi-arid regions (FAO 2003). For effective rainwater harvesting (RWH) in such areas, the FAO recommends a minimum average annual rainfall ranging from 300 to 600 mm (Zakaria et al. 2012; Oweis et al. 1999). In our study, we collected rainfall data spanning 42 years, from 1980 to 2022, courtesy of the Iraqi Meteorological Department. To spatially distribute this data accurately, we employed the inverse distance weighting approach, a robust method. Furthermore, we utilized ArcMap tools to reclassify and process the data, in alignment with the methodology proposed by Bashir and Fouli (2015). Rainfall is related to uncertainty, and probability distributions formally and quantitatively describe the pattern of uncertainty. The rainfall data were cross-referenced with NASA Power Project data and the Ministry of Agriculture in Iraq to minimize any inaccuracy and to check missing data and consistency as well.
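The inverse distance weighting step described above can be illustrated with a minimal sketch. It is not the ArcMap implementation used in the study: the function name `idw_estimate`, the distance exponent of 2, and the station coordinates and rainfall values are all assumptions made for illustration.

```python
import math

def idw_estimate(stations, x, y, power=2.0):
    """Inverse-distance-weighted rainfall estimate at point (x, y).

    stations: list of (x_km, y_km, rainfall_mm) tuples.
    power: distance exponent; 2 is a common default and is assumed here.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, rain in stations:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return rain  # the point coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * rain
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical stations (planar coordinates in km, annual rainfall in mm).
stations = [(0.0, 0.0, 300.0), (10.0, 0.0, 400.0), (0.0, 10.0, 350.0)]
midpoint = idw_estimate(stations, 5.0, 0.0)
```

Nearer stations dominate the estimate because each weight falls off with the chosen power of distance, so a larger exponent gives a more local surface.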

3.1.4 Stream order and drainage density

The relationship between stream order and permeability, as well as infiltration drainage, is an essential factor to consider in the context of rainwater harvesting (RWH). In general, lower stream orders are associated with higher permeability and greater infiltration capacity, while higher stream orders tend to exhibit the opposite characteristics. Therefore, the precise measurement of stream order and drainage density is of paramount importance when implementing RWH systems. To quantitatively express drainage density, we can turn to Eq. (1), a mathematical representation defined as the total stream length per unit catchment area, as discussed by Rai et al. (2017).

$$DD = \frac{{\sum\limits_{i = 1}^{n} L_{i} }}{A}$$
(1)

where L stands for the length of each stream in kilometers, n for the number of streams, A for the drainage area in square kilometers, and DD for the drainage density.
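As a small worked example of Eq. (1), the sketch below sums stream lengths and divides by the catchment area. The function name and the sample lengths and area are hypothetical and do not come from the study data.

```python
def drainage_density(stream_lengths_km, area_km2):
    """Total stream length divided by catchment area, in km per km^2."""
    if area_km2 <= 0:
        raise ValueError("catchment area must be positive")
    return sum(stream_lengths_km) / area_km2

# Hypothetical catchment: three stream segments in a 25 km^2 basin.
dd = drainage_density([12.0, 8.5, 4.5], 25.0)
```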

Finding suitable RWH locations involves taking the runoff into account. The potential water supply from runoff is determined using the runoff depth. The Soil Conservation Service curve number (SCS-CN) method, made available by the USDA, 2020 (Matomela et al. 2020), is used to calculate the runoff depth model. For each pixel in the study area, the curve number (CN) was derived from the land use/land cover and soil-texture group models.

Runoff depth can be expressed by Eq. (2):

$${\text{Q}} = \frac{{\left( {P - I_{a} } \right)^{2} }}{{\left( {P - I_{a} } \right) + S}}$$
(2)

Q is the runoff depth (mm), P is the precipitation amount (mm), S is the potential maximum retention after runoff begins (mm), and Ia is the initial abstraction (mm), which accounts for all losses before runoff begins, including infiltration, evaporation, and water intercepted by vegetation (Ibrahim et al. 2019). Based on rainfall data from several small agricultural basins, Ia = 0.2S (Melesse and Shih 2002). Substituting this relation into Eq. (2) gives Eq. (3):

$${\text{Q}} = \frac{{\left( {P - 0.2S} \right)^{2} }}{{P + 0.8S}}$$
(3)

Equation (4) can be utilized to calculate the theoretical maximum retention (S) using the estimated curve number (CN) as follows:

$$S = \frac{25400}{{CN}} - 254$$
(4)
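The runoff calculation in Eqs. (2) to (4) can be sketched for a single pixel as follows. This is an illustrative function, not the raster workflow used in the study. It assumes the standard metric form S = 25400/CN − 254 (S in mm) and the usual SCS-CN condition that runoff is zero when P does not exceed Ia. The name `scs_runoff_depth` and the sample inputs are hypothetical.

```python
def scs_runoff_depth(p_mm, cn, ia_ratio=0.2):
    """Runoff depth Q (mm) from rainfall P (mm) and curve number CN."""
    if not 0 < cn <= 100:
        raise ValueError("CN must be in (0, 100]")
    s = 25400.0 / cn - 254.0      # potential maximum retention S, mm
    ia = ia_ratio * s             # initial abstraction Ia = 0.2 S, mm
    if p_mm <= ia:
        return 0.0                # rainfall fully abstracted, no runoff
    return (p_mm - ia) ** 2 / (p_mm - ia + s)

# Hypothetical pixel: 50 mm of rainfall on a surface with CN = 80.
q = scs_runoff_depth(50.0, 80)
```

For CN = 80, S is 63.5 mm and Ia is 12.7 mm, so about 13.8 mm of the 50 mm rainfall becomes runoff in this example; with CN = 100 the retention is zero and all rainfall runs off.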

CNs (Curve Numbers) are dimensionless and range from 1 to 100. A CN of 100 indicates poor infiltration and excessive runoff, whereas a CN of 1 suggests excellent infiltration and low runoff. High CNs suggest that surface runoff will account for a significant portion of the rainfall (Krois and Schulte 2014). The stream order model was created using ArcGIS 10.8.1 software and reclassified into six classes. According to Ammar et al. (2016), the more stream orders, the better the drainage density and the greater the suitability for RWH. The drainage density model was classified into five categories: very poor, poor, moderate, good, and very good. Figures 10 and 11 illustrate the stream order and drainage density models for the study region, respectively.

Fig. 10 Stream order for the study region

Fig. 11 Drainage density for the study region

3.1.5 Generation of forecasting weather data and climate scenario

In order to project future precipitation, we adopted a comprehensive approach, encompassing both current and prospective climatic scenarios. To accomplish this, we harnessed the Long Ashton Research Station Weather Generator (LARS-WG), a specialized tool specifically tailored for assessing climate change impact by generating daily time series data, as thoroughly detailed by Khalaf et al. (2022).

In our quest to mitigate uncertainty in climate projections and enhance the accuracy of downscaling temperature and precipitation data, we deployed an upgraded iteration of LARS-WG, specifically version 7.0. This advanced version was harmonized with data from five Global Climate Models (GCMs), thoughtfully presented in Table 3.

Table 3 The global climate models selected from IPCC AR6

The prediction of climate parameters, such as rainfall and temperature, was executed using a weather generator that capitalizes on daily observed weather data from specific sites. This data formed the foundation for constructing a set of parameters for probability distributions and meteorological parameter correlations, a process expertly orchestrated through the utilization of LARS-WG models, as outlined by Semenov et al. (2013).

The climate prediction and generation scenarios for a specific future period were meticulously constructed. This was anchored in a baseline parameter, meticulously calculated from observed weather data for Kirkuk Governorate during the period spanning 1980–2022. These parameters were then combined using the Coupled Model Intercomparison Project Phase 6 (CMIP6), a prominent component featured in the IPCC multi-model sixth report, and seamlessly integrated into version 7.0.

Model performance was checked with LARS-WG 7.0, using daily meteorological input data for all stations in the study region together with station location and CO2 information, to generate and verify the files required for the site analysis phase, which produces three site variable files for the study region.

3.1.6 Site suitability of RWH and criteria selection

Each phase in the creation of a rainwater harvesting (RWH) suitability map was meticulously conducted using ArcGIS 10.8.1. Employing a Weighted Linear Combination (WLC) approach, complemented by the Analytical Hierarchy Process (AHP), our suitability model generated RWH compatibility maps by integrating a range of criteria, as emphasized by Yin et al. (2020). WLC, a well-established Multi-Criteria Assessment (MCA) technique, is instrumental in the process of selecting suitable RWH sites.

This comprehensive methodology encompasses several critical steps, starting with the standardization of suitability maps, followed by the allocation of relative significance to these maps. The culmination of these steps involves blending these weighted factors with consistency maps to yield a suitability score, a process detailed by Al-Hanbali et al. (2011). The model hinges on the fundamental principle of weighted averages, where each factor's weight is multiplied by its corresponding consistency score.

The following equation was employed to calculate the WLC suitability score for the map:

$$S = \sum\limits_{i} R_{i} W_{i} \times SL_{i}$$
(5)

In the provided equation, ‘S’ signifies the suitability score for site selection, while ‘RiWi’ represents the relative weight significance attributed to input layers, including factors such as rainfall, slope, drainage density, soil characteristics, stream order, land cover/land use, and soil texture type. The variable ‘SLi’ reflects the degree of acceptability or suitability for each cell within the input layer. Higher ‘SLi’ values within cells indicate a stronger suitability for rainwater harvesting (RWH) systems. The suitability score (‘S’) is categorized into four distinct classes: ‘low suitability,’ ‘medium suitability,’ ‘high suitability,’ and ‘extremely high suitability.’ This categorization strategy has been applied in various previous research studies, exemplified by the work of Naseef and Thomas (2016).
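A minimal sketch of the weighted linear combination in Eq. (5) on small rasters is given below. It assumes NumPy is available and uses hypothetical 2 × 2 layers, illustrative weights (not the values in Table 4), and equal-width class breaks chosen only for demonstration. The study itself performed this step in ArcGIS.

```python
import numpy as np

def wlc_score(layers, weights):
    """Weighted linear combination: per-cell sum of weight * suitability level."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights should sum to 1")
    stack = np.stack([np.asarray(layer, dtype=float) for layer in layers])
    return np.tensordot(weights, stack, axes=1)

# Hypothetical 2x2 rasters already reclassified to scores of 1-10.
rainfall = [[8, 6], [4, 2]]
slope = [[10, 10], [5, 1]]
weights = [0.6, 0.4]                      # illustrative, not the paper's weights
score = wlc_score([rainfall, slope], weights)

# Illustrative equal-width breaks for the four suitability classes.
breaks = [3.25, 5.5, 7.75]
classes = np.digitize(score, breaks)      # 0 = low ... 3 = extremely high
```

Because the weights sum to 1, the combined score stays on the same 1 to 10 scale as the input layers, which makes the class breaks easy to interpret.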

The ultimate determination of weights assigned to each criterion, the prescribed values for criterion classes, and the establishment of buffering zones constitute the primary criteria for calculating site suitability levels for RWH systems. This layer is subsequently combined with the output layer produced through the Weighted Linear Combination (WLC) method, culminating in the creation of the definitive RWH suitability map. In the development of the final model and map for selecting suitable RWH sites, criteria were identified based on their significance and influence within the model.

Among the influential factors are rainfall (RF), rainfall-runoff depth (RD), slope (SL), soil texture (ST), drainage density (DD), and land cover/land use (LC), with rainfall ranking as the most influential. We used the Weighted Linear Combination (WLC) method alongside the Analytical Hierarchy Process (AHP) pairwise comparison of weights, an approach described by Saaty (2008), to determine the weights for these layers. Table 4 presents the weight comparison matrix values used to determine suitable site selections for RWH.
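The way AHP turns a pairwise comparison matrix into criterion weights can be sketched as follows. The example assumes NumPy, takes the weights from the principal eigenvector, and reports Saaty's consistency ratio (values below 0.10 are conventionally considered acceptable). The 3 × 3 matrix is hypothetical and is not the matrix in Table 4.

```python
import numpy as np

# Saaty's random consistency index for matrix orders 2-6.
RANDOM_INDEX = {2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24}

def ahp_weights(pairwise):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix."""
    a = np.asarray(pairwise, dtype=float)
    n = a.shape[0]
    eigvals, eigvecs = np.linalg.eig(a)
    k = np.argmax(eigvals.real)               # index of the principal eigenvalue
    w = np.abs(eigvecs[:, k].real)
    w /= w.sum()                              # normalize weights to sum to 1
    ci = (eigvals[k].real - n) / (n - 1)      # consistency index
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return w, cr

# Hypothetical judgments: rainfall vs. slope vs. soil texture.
matrix = [[1, 2, 4],
          [1 / 2, 1, 2],
          [1 / 4, 1 / 2, 1]]
weights, cr = ahp_weights(matrix)
```

This sample matrix is perfectly consistent, so the weights come out in the ratio 4:2:1 and the consistency ratio is essentially zero. Real expert judgments usually give a small positive ratio.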

Table 4 The comparison weight matrix values to identify the suitable site selection for RWH

The Food and Agriculture Organization (FAO) has recognized six pivotal parameters for evaluating site selection in soil water conservation: climatic, hydrologic, topographic, agronomic, soil types, and socioeconomic factors. Informed by local expertise and supported by the literature, each criterion employed in the RWH suitability maps was categorized and normalized into six distinct value classes and assigned a score ranging from 1 to 10, as depicted in Table 5. Scores approaching 10 indicate a high degree of suitability, while a score of 1 indicates low suitability. The scores were processed using Geographic Information System (GIS) techniques, categorizing the various criteria (layers) in accordance with their respective values. The combination of these criteria provides a comprehensive framework for identifying and prioritizing potential RWH sites. The criteria ratings were based on previous literature, previous studies in the region, the nature of the study area, dataset availability, and experts' opinions.

Table 5 The criteria classes score ratings and values

3.1.7 Model downscale and future projection

We assessed the model's performance using LARS-WG 7.0, utilizing daily meteorological data from all stations within the study area. This involved incorporating data on CO2 levels to generate and validate the necessary files for the upcoming site analysis phase, which in turn yielded three site variable files for the study region: wgx, stx, and tst. The wgx file contains information about the study region, the stx file provides statistical outcomes, and the tst file compiles statistics for both observed and generated data to facilitate comparison.

Table 6 evaluates the model's performance in simulating the seasonal distributions of wet and dry series using the Kolmogorov–Smirnov (K–S) test, and Table 7 applies the same test to the distribution of daily rainfall.

Table 6 K–S test for seasonal wet/dry series distribution
Table 7 K–S test for daily rain distribution
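For readers unfamiliar with the K–S test, the sketch below computes the two-sample statistic, that is, the largest gap between the empirical distribution functions of an observed and a generated sample, together with the usual large-sample critical value at the 5% level. It is a generic illustration under stated assumptions, not the routine implemented inside LARS WG; the function names and the sample values are hypothetical.

```python
import math


def ks_two_sample(observed, generated):
    """Return the two-sample K-S statistic D = max |F_obs(x) - F_gen(x)|."""
    a, b = sorted(observed), sorted(generated)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both samples past x so that ties are handled together.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_critical_value(n, m, coefficient=1.358):
    """Large-sample critical value; 1.358 corresponds to alpha = 0.05."""
    return coefficient * math.sqrt((n + m) / (n * m))


# Hypothetical wet-spell lengths (days), for illustration only.
observed_spells = [1, 1, 2, 2, 3, 4, 5, 7]
generated_spells = [1, 2, 2, 3, 3, 4, 6, 8]
d_stat = ks_two_sample(observed_spells, generated_spells)
d_crit = ks_critical_value(len(observed_spells), len(generated_spells))
print("D =", round(d_stat, 3), "critical value =", round(d_crit, 3))
```

A statistic below the critical value means the test does not reject the hypothesis that both samples come from the same distribution. The asymptotic critical value is only approximate for samples as small as those in this example.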

During the winter season (DJF), the model demonstrates exceptionally good performance. In the spring season (MAM), it accurately replicates wet spell distribution and performs well for dry conditions. Moving on to the summer season (JJA), the model's performance is mixed, showing areas of excellence in dry conditions but struggling with wet spells. In the autumn season (SON), the model consistently performs exceptionally well for both wet and dry spells.

Figure 12 visually represents the monthly average precipitation for both observed and future projections within the study region. It is evident from the illustration that there are notable differences in precipitation levels, with a clear decrease in future projections.

Fig. 12 Monthly average precipitation for observed and future projection

4 Results

4.1 Future and historical rainfall spatial distribution

Figures 13 and 14 show the historical annual average precipitation in Kirkuk Governorate over a span of 42 years and the projected average annual precipitation for the future period from 2031 to 2100. These maps depict the spatial distribution of rainfall across the study region. To reduce uncertainty, a normality test based on the empirical CDF was applied to the average monthly observed rainfall data, as shown in Fig. 15. The figure reveals some skewness at some points and a shift in the rainfall distribution at the tested stations. This probabilistic information is useful for the spatial distribution of the rainfall data, because uncertainty in the output depends largely on uncertainty in the input. Data with a normality level above 55% were selected for use in this study. In the historical map, high precipitation values are concentrated in the central and western regions of the research area, while lower values are clustered in the northeast. The data from Fig. 10 indicate that the maximum recorded precipitation was 490 mm and the minimum was 270 mm. The projected average annual precipitation shows a different spatial distribution: maximum values are concentrated in the southwest of the study area and minimum values in the northeast, ranging from 255 mm to 390 mm. These changes in average annual rainfall over this extended period underscore the influence of climate change on precipitation in the study region.
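The size of the projected change can be made explicit with a short calculation on the four values quoted above. This is only a sketch: it compares the extremes of the two maps, which need not occur at the same location, and the function name `relative_change` is an assumption.

```python
def relative_change(historical, projected):
    """Percentage change from the historical to the projected value."""
    return (projected - historical) / historical * 100.0


# Annual precipitation extremes quoted in the text, in millimeters.
HISTORICAL_MM = {"minimum": 270.0, "maximum": 490.0}
PROJECTED_MM = {"minimum": 255.0, "maximum": 390.0}

for key in HISTORICAL_MM:
    change = relative_change(HISTORICAL_MM[key], PROJECTED_MM[key])
    print(f"{key}: {change:.1f}%")
```

Under this reading, the upper end of the precipitation range falls by roughly a fifth, while the lower end falls by about 6%.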

Fig. 13 Historical average precipitation of study region

Fig. 14 Future projection precipitation of the study region

Fig. 15 Empirical CDF normality distribution for rainfall data

4.2 Potential runoff depth

Runoff depth is considered one of the most important factors in identifying suitable sites for RWH. It was therefore estimated by integrating the curve number (CN) grid values with the land use/cover for each polygon. The pixel values were used to generate the RWH potential map, the runoff was computed, and models of current and future runoff were constructed for the study region. The annual runoff was divided into five categories: extremely poor (190 mm), poor (200–250 mm), moderate (290–340 mm), good (350–400 mm), and very good (> 400 mm). Areas with moderate to very good potential runoff make up most of the watershed region. The future runoff of the study area was calculated from the downscaled future precipitation values, and the resulting runoff model was likewise classified into five classes. Future runoff ranges from 180 mm to 390 mm across the study region, so projected surface runoff is lower because rainfall is reduced under climate change. Figure 16 illustrates the validation model for the rainfall-runoff relationship in Kirkuk Governorate. The annual runoff depth model results for Kirkuk Governorate, in millimeters, are shown in Figs. 17 and 18 for the historical and future periods, respectively.
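The curve number calculation can be sketched as follows, assuming the standard SCS-CN formulation in metric units with an initial abstraction of 0.2S. This may differ in detail from the runoff equations used in this study. The function name, the example rainfall depth, and the CN value are hypothetical.

```python
def scs_cn_runoff(rainfall_mm, curve_number, ia_ratio=0.2):
    """Runoff depth (mm) from rainfall depth (mm) by the SCS-CN equation."""
    if not 0 < curve_number <= 100:
        raise ValueError("curve number must be in (0, 100]")
    retention = 25400.0 / curve_number - 254.0  # potential maximum retention S, mm
    abstraction = ia_ratio * retention          # initial abstraction Ia, mm
    if rainfall_mm <= abstraction:
        return 0.0                              # all rainfall is abstracted
    excess = rainfall_mm - abstraction
    return excess ** 2 / (excess + retention)


# Hypothetical storm of 50 mm on a pixel with CN = 80.
print(round(scs_cn_runoff(50.0, 80.0), 2))
```

The SCS-CN method was developed for individual storm events, so applying it to each pixel with event or daily rainfall and then accumulating to annual totals is the more conventional use. Higher CN values, as expected for Hydrologic Soil Groups C and D, give higher runoff for the same rainfall.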

Fig. 16 The validation model for the rainfall-runoff

Fig. 17 Historical runoff for the study area

Fig. 18 Future runoff depth of study area

4.3 Final suitability models for RWH for historical and future projection

The Rainwater Harvesting (RWH) potential map for the study area was built in ArcMap from thematic layers of slope, runoff, land use/cover, soil texture, rainfall, and drainage density. Each thematic layer's capacity to retain rainwater was considered when assessing its attributes.

The approach combined the Weighted Overlay method, the Analytical Hierarchy Process (AHP), and Multi-Criteria Assessment (MCA). Once the essential attributes of each thematic layer were determined, pairwise comparison matrices were established for the layer weights. The weights were normalized using the eigenvector technique, and the weight of each thematic layer was computed using Eq. (5). Rainfall has the largest weight at 38%, followed by runoff depth at around 25%, slope at 16%, soil texture at 10%, and drainage density and land cover/use at 7% and 4%, respectively. Figure 19 depicts the weight percentage of each thematic layer. The suitability zones for RWH were classified into four categories: 'very high suitability,' 'high suitability,' 'moderate suitability,' and 'low suitability,' as shown in Figs. 20 and 21 for the historical and future projections, respectively.

Fig. 19 The weight percentage of each thematic layer
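The weighted combination of the scored layers can be sketched for a single pixel as below. The layer weights are the percentages reported in the Discussion of this study (38, 25, 16, 10, 7, and 4%). The pixel scores and the thresholds separating the four suitability classes are hypothetical, since the class limits are not stated in the text, and the function names are assumptions.

```python
# Layer weights in percent, as reported in the Discussion of this study.
WEIGHTS = {
    "rainfall": 38,
    "runoff_depth": 25,
    "slope": 16,
    "soil_texture": 10,
    "drainage_density": 7,
    "land_cover": 4,
}

# Hypothetical lower limits on the 1-10 score scale for the four classes.
CLASS_BREAKS = [(7.5, "very high"), (5.5, "high"), (3.5, "moderate")]


def wlc_score(scores, weights=WEIGHTS):
    """Weighted linear combination of per-layer scores (1-10) for one pixel."""
    # Dividing by the weight total keeps the result on the 1-10 scale
    # even if rounded percentages do not sum to exactly 100.
    total = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total


def suitability_class(score):
    """Assign one of the four suitability classes to a combined score."""
    for lower_limit, label in CLASS_BREAKS:
        if score >= lower_limit:
            return label
    return "low"


# Hypothetical reclassified scores for one pixel.
pixel = {
    "rainfall": 9,
    "runoff_depth": 7,
    "slope": 5,
    "soil_texture": 7,
    "drainage_density": 5,
    "land_cover": 3,
}
score = wlc_score(pixel)
print(round(score, 2), suitability_class(score))
```

Because rainfall and runoff depth together carry 63% of the weight, a reduction in the scores of these two layers under the future projection lowers the combined score of a pixel more than a change in any other layer.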

In the historical potential RWH zones, the distribution was as follows: ‘very high suitability’ covered 8.2%, ‘high suitability’ approximately 22.6%, ‘moderate suitability’ about 37.4%, and ‘low suitability’ 31.8% of the study region, as shown in Fig. 20. For the future projection, the distribution shifted toward lower suitability: ‘very high suitability’ covered approximately 6.1%, ‘high suitability’ around 18.3%, ‘moderate suitability’ approximately 31.2%, and ‘low suitability’ around 44.4% of the study region, as shown in Fig. 21.
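The shift between the two distributions can be summarized in percentage points with a short calculation on the area shares quoted above. The function name `share_changes` is an assumption made for this sketch.

```python
# Area shares (% of the study region) quoted in the text.
HISTORICAL = {"very high": 8.2, "high": 22.6, "moderate": 37.4, "low": 31.8}
FUTURE = {"very high": 6.1, "high": 18.3, "moderate": 31.2, "low": 44.4}


def share_changes(before, after):
    """Change in area share per class, in percentage points."""
    return {name: round(after[name] - before[name], 1) for name in before}


changes = share_changes(HISTORICAL, FUTURE)
for name, delta in changes.items():
    print(f"{name}: {delta:+.1f} percentage points")
```

The three more suitable classes lose 2.1, 4.3, and 6.2 percentage points, respectively, and the low suitability class gains the corresponding 12.6 percentage points.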

Fig. 20 RWH potential site selection for historical rainfall

Fig. 21 RWH potential sites selection for future projection

5 Discussion

This study was conducted with the primary goal of identifying suitable locations for rainwater harvesting to support water resources management under current and projected future conditions in a semi-arid environment. To achieve this, a comprehensive methodology was employed to assess suitable RWH sites in the study region. Comparable approaches have been reported elsewhere. Ibrahim et al. (2019) used the Multi-Criteria Evaluation (MCE) method in conjunction with GIS to identify suitable RWH sites in Dohuk Governorate in northern Iraq, finding that only 15% of the area was highly suitable for RWH. Similarly, Al-Ghobari et al. (2021) employed GIS and remote sensing to identify suitable RWH sites in the semi-arid region of Wadi Nuaam, Saudi Arabia, where rainwater harvesting was feasible across 40% of the study area. In another case, Khan et al. (2022) used GEO-HEC and GIS techniques to identify suitable structural sites for RWH in Pakistan's semi-arid regions.

Over the 42-year record, rainfall in the study area fluctuated, with a recent decline in average yearly precipitation. Annual rainfall ranged spatially from 270 to 490 mm in the historical period and from 255 to 390 mm in the future projection. Given the significance of rainfall data, steps were taken to address the associated uncertainty. A normality test was applied, and only data with normality exceeding 55% were included, to limit inaccuracies in the representation of rainfall variability. The rainfall data were also validated to improve accuracy. A rainfall-runoff correlation test for the study region revealed a strong positive correlation between the two variables (R² values of 95% and 96.83%), underscoring their importance for water resources and for the assessment of potential rainwater harvesting.

The Weighted Linear Combination (WLC) was employed to assign weights to the thematic layers. This approach provided a high degree of flexibility in weight allocation. Moreover, the inclusion of consistency checks for all themes within the WLC framework improved criteria selection and the precision of the delineated RWH zones. Rainfall carried the greatest weight at 38%, making it the most important factor in identifying suitable RWH sites. Runoff depth, at 25%, was the second most influential factor because of its substantial impact on site suitability. Slope, with a weight of 16%, also played a significant role. Soil texture had a weight of 10%, reflecting the presence of Hydrologic Soil Groups (HSGs) C and D, which indicate sufficient runoff and low infiltration. Drainage density and land cover/use contributed 7% and 4%, respectively, to the overall weighting. The study demonstrates the effective combination of WLC with remote sensing, GIS, and downscaling with LARS WG and global climate models for future projection. This integrated approach was used to identify appropriate RWH sites across the study region. Comparably, Waghaye et al. (2023) used AHP and GIS techniques in conjunction with remote sensing to identify suitable RWH sites in India. Similarly, Saha et al. (2021) employed a runoff model alongside RS and GIS techniques to identify suitable sites in West Bengal, using thematic layers within a multi-criteria decision analysis (MCDA) framework. Ultimately, the findings of this research have practical applications for water resources management and future water sustainability.

The findings can support water resources planning, alleviate water stress within the study region, and help minimize water shortages in semi-arid zones such as Iraq, which has suffered from severe climate change impacts and drought, particularly in recent decades. The approach still requires field verification. The methodology of this paper can guide site selection and reduce design costs at this stage, and closer collaboration with Iraqi government authorities, such as the ministry of water resources and management, would allow more field visits to be carried out in the near future.

6 Conclusion

Rainwater harvesting (RWH) serves as a promising solution to combat water scarcity challenges and bolster water supply, especially in the context of shifting climate patterns. In this research, we tackled this issue by employing a geographic information system (GIS)–based multi-criteria analysis (MCA) technique in conjunction with the LARS WG (weather generator). Our aim was to generate RWH projections and establish a robust methodology for identifying potential RWH sites, both for the present and future development plans.

Throughout this study, ArcGIS emerged as a pivotal tool for amalgamating data from various thematic layers to pinpoint relevant locations. The process of identifying optimal areas for collecting and storing rainwater demands access to a versatile, efficient, and comprehensive data source, with ArcGIS serving as an exemplar in this regard. Additionally, the integration of GIS technology with LARS WG 7.0 for downscaling rainfall for future projections under diverse climate scenarios proved to be a highly effective and innovative approach. This integration holds particular significance for regions and nations grappling with persistent water stress and severe climate change impacts, such as Iraq. The primary objective of this research study is to advance water management and sustainability over the long term by introducing novel methods and strategies. Key variables, including slope, stream order, drainage density, land use/cover, soil types, and historical and future rainfall data, were considered in our analysis.

Our findings reveal that for the current potential site selection, 22.6% and 8.2% of the areas are deemed of high and very high suitability, respectively, while 37.4% are of moderate suitability, and 31.8% are of low suitability. However, the potential zones of RWH for future projections exhibit a shift, with very high suitability covering around 6.1%, high suitability at approximately 18.3%, moderate suitability at around 31.2%, and low suitability making up about 44.4% of the study region. These changes reflect the influence of climate change on rainfall-runoff amounts and, in turn, impact site suitability percentage zones.

The findings of this study are significant for water management experts and decision-makers in the studied area in the face of mounting water challenges. As water resources become scarcer, these insights can guide the efficient use of this limited resource. The identification of suitable RWH sites was made possible by integrating Geographic Information Systems (GIS), remote sensing technology, and the LARS WG weather generator. This combination shows the potential of interdisciplinary approaches that bring together technology, data, and innovative methodologies. The analysis and methodologies presented here offer a blueprint for water resource management not only in the studied region but also in similar semi-arid areas worldwide. Nations grappling with water scarcity under climate change can adopt and adapt these strategies to safeguard their water resources, bolster resilience, and foster sustainable development. This study is a stepping stone toward addressing the pressing issue of water scarcity, and it calls for further investigations, studies, and collaborative efforts. Future work must continue to explore innovative solutions, harness advanced technology, and leverage interdisciplinary approaches to ensure the long-term sustainability of water resources.