Spatio-temporal calibration of Hargreaves–Samani model in the Northern Region of Nigeria

One of the significant components of the hydrological cycle is evapotranspiration. Monthly meteorological parameters of 35 years from 19 meteorological stations across the Northern Region of Nigeria (NRN) were obtained and utilized for the calibration of Hargreaves–Samani (HS) model by comparing between potential evapotranspiration (ETo) values estimated from the original HS and the Penman–Monteith (FAO-56 PM) models. The calibrated HS equation was assessed using trend patterns and some statistical indices. The average value of root mean square error (RMSE) and the mean absolute error (MAE) decreased by 37.1 and 40%, respectively, after the calibration of the model. Also, the correlation coefficients (R) of stations that had values > 0.8 increased from 6 to 11 and the minimum R value increased by 12% above that of the original HS equation. The trend and spatial map of the statistical tests conducted also indicate better performance in most climatic regions after calibration. The precision of the HS equation improved significantly after calibration for semi-arid, humid, and sub-humid regions. However, few stations in the semi-arid, humid, and sub-humid regions did not show drastic improvement due to the peculiarity of the location and high variations in the wind speed and relative humidity parameters.


Introduction
The necessity to efficiently manage the water resources and the agricultural sectors in a developing country where subsistence farming is the most available job is very important and must be given priority in order to avert serious economic and environmental challenges. It is even more pertinent if the area under consideration is classified under arid or semi-arid regions (Jerin et al. 2021;Salam and Islam 2020).
The issue of climate change which has led to more drought occurrences in many parts of the earth, most significantly in the Sahel region of West Africa, in recent decades calls for efficient management of available water resources (Ibebuchi 2021;Dong and Sutton 2015;Lele and Lamb 2010;Dai et al. 2004;Sultan and Janicot 2000). Evapotranspiration (ET) is very essential to quantify the amount of water that will be needed to sustain a crop especially when rainfall is not sufficient. The accurate measurement of changes in ET is very important for modeling ecological and hydrological changes and natural plant habitat (Heydari and Heydari 2014).
ET is a significant parameter in calculating the energy and water balances on the surface of the earth. Understanding this parameter on a spatiotemporal scale will provide answer to some pertinent questions in climatology, hydrology, agronomy, and ecology (Animashaun et al. 2020;Rivas and Caselles 2004). In addition, the knowledge of ET is key for strategizing a viable economic use of the available water resources (Tellen 2007). Efficient irrigation water requirement is usually estimated using potential evapotranspiration (ET o ). Therefore, many irrigation experts make use of ET o and crop coefficients (Kc) to compute the irrigation water required for different crops (Tellen 2007). Despite the fact that there are many model equations for calculating ET o from meteorological data, there happened to be no rigid agreement on the acceptability of any particular model for a climatic region, but Penman-Monteith (FAO-56 PM) model is referred to as standard because of its good accuracy and consistency when compared with other models (Allen et al. 1998). The main benefit of the FAO-56 PM model is that it has taken into consideration the principles of aerodynamics and thermodynamic components of the climate system. The FAO-56 PM model has been adopted and used for many climatic regions (e.g., Mohammadi and Mehdizadeh 2020;Ogunrinde et al. 2020a;Li et al. 2018;Tellen 2007;Heydari and Heydari 2014;Ventura et al. 1999). Although the FAO-56 PM model has a general acceptability for many climates, the need for many weather parameters that are usually not available at many meteorological stations, especially in the developing countries, have limited its usage (Pereira and Pruitt 2004).
It has been observed that the Hargreaves-Samani (HS) model of 1985, which uses only temperature data, has the tendency to provide acceptable ET o similar to that of FAO-56 PM model for arid region. However, there is a need for the local calibration of this model in humid, sub-humid, and semi-arid climatic regions (Zhu et al. 2019;Gavilan et al. 2006). It has also been established that HS model usually overestimate ET o in humid climate area and underestimate in arid regions, thereby making it not too effective in managing water resources without calibration (Trajkovic 2005).
Several studies have showed the efficiency of the HS model in relation to FAO-56 PM model for estimating ET o . Hargreaves (1994) compared the ET o data collected from well-established lysimeters across many climatological regions using Hargreaves equation with PM equation on weekly, monthly, and yearly basis. He discovered that the Hargreaves equation indicated a very similar ET o result to that of the PM equation. In addition, Droogers and Allen (2002) showed a significant correlation between the monthly estimates of ET o using HS and PM equations. However, an insignificant correlation was noted between HS and PM methods when estimated on daily and weekly basis for five different climatic regions except at semi-arid area (Stockle et al. 2004). A more extensive comparative study between Hargreaves and the PM method in Spain that covered 86 synoptic stations, most of which is under the semi-arid regions, concluded that there is a need for regional calibration of the Hargreaves method using temperature and wind data (Gavilan et al. 2006). In another similar study carried out in Iran by Fooladmand and Sepaskhah (2005), a calibration of the Hargreaves equation was done for estimating the monthly ET o in Bajgah area in Fars province, which is a semi-arid region in Iran based on the PM equation. Also, a study by Heydari and Heydari (2014) in arid and semi-arid areas of Iran showed that the original coefficient value (0.0023) of the Hargreaves equation ranged between 0.0013 and 0.0043. Although some ET o models have been calibrated for many countries in the world, for example, in Iran (e.g., Valipour 2015; Heydari and Heydari 2014), China (e.g., Zhu et al. 2019;Gao et al. 2015), Poland (e.g., Bogawski and Bednorz 2014), USA (e.g. Thepadia and Martinez 2012), and Southeast Australia (Azhar and Perera 2011), paucity of data and lack of sufficient information in terms of suitable ET o model to be applied for the Northern Region of Nigeria (NRN) are posing a serious challenge to the limited water resources.
The NRN is located between latitude 6 and 14°N of the West Africa coast covering some part of Sahel, Sudan, and Guinea savannahs. The Sahelian region cuts across several countries in western and central Africa. For example, in the Senegal River Valley (SRV), the major produced crop is rice and the water resources that can be applied for irrigation is becoming very scarce and expensive (Djaman et al. 2015). Like in the NRN, SRV, and many other Sahelian regions, efficient management of water has not been considered as very important; therefore, irrigation schemes are poorly developed (Djaman et al. 2015). In SRV, it is noted that irrigation requirement is the main factor to be considered for the cost of rice production. This is due to the fact that the cost of pumping irrigation water is very expensive, thereby making 30% of the total cost of operation (Comas et al. 2012). In order to limit cost or efficiently manage water resources, there is a need for accuracy in the determination of irrigation water requirement in the SRV, NRN, and other Sahelian regions.
Some notable studies carried out in Nigeria as regards the estimation of ET o include, e.g., Akpootu and Iliyasu (2017), Ejieji (2011), andAdeboye et al. (2009). All these studies succeeded in comparing many ET o models with the FAO-PM 56 without actually calibrating or developing a localized ET o for any region in Nigeria. To compensate for this and other identified gaps, the aims of this study are (1)

Description of the study area and data collection
The NRN covers about 75% of the total land mass of Nigeria (925,796 km 2 ). This region has two seasons: the dry and the wet seasons. The dry season ranges from 4 to 7 months depending on the climatic zone. Rainfall distribution within the NRN ranges from about 400 to 1500 mm depending on the latitude of the location under consideration (Oguntunde et al. 2011). The monthly climatic variables of 19 stations which include relative humidity, minimum and maximum temperatures, wind speed, and sunshine hours utilized for this study spanned through January 1981 to December 2015. The relative humidity and minimum and maximum temperature parameters were obtained from Nigeria Meteorological Agency (NIMET) headquarters, Abuja, Nigeria. However, since wind speed and sunshine duration datasets were not readily available in most of the stations during the study period, bias-corrected datasets were downscaled from ERA-interim reanalyzed data (Berrisford et al. 2011). For the present study, the reanalyzed sets were used because they had strong correlation coefficient (> 0.8) with observed data as indicated by Oluleye and Adeyewa (2016) using some meteorological station datasets in Nigeria. The wind speed data utilized for this study consist of monthly u wind and v wind components. Therefore, wind speed magnitude at 2 m height (U 2 ) was calculated using Eq. (1): The quality of wind speed, relative humidity, sunshine hours, and solar radiation was verified using the methodology suggested by Allen et al. (1998). The solar radiation (R s , MJ m −2 day −1 ) was computed using Angstrom equation, which links extraterrestrial radiation and sunshine hours to solar radiation: where a s and b s are regression constants, which indicate the fraction of extraterrestrial radiation reaching the earth on clear days. The values adopted for a s and b s are 0.25 and 0.50, respectively.
nis the actual duration of sunshine hours.
Nis the maximum possible of sunshine or daylight hours. R a is the extraterrestrial radiation (MJ m −2 day −1 ). The N and R a were determined using the following equations: where. G sc 0.0820 MJ m −2 min −1 . d r is the inverse relative distance of earth-sun.
s is the sunset hour angle (rad). (1) where J is the number of days in the year between 1 January and 31 December. Corresponding J values for all the days of the year can be estimated using an algorithm in Table 2.5 Annex 2, FAO 56 paper (Allen et al. 1998).
To ascertain the quality of relative humidity data, actual vapor pressure e a was computed based on the assumption that the dew-point temperature is similar to the minimum temperature (T min ) values, there using the following equation (Allen et al. 1998): The summarized detail of these variables and other station parameters are contained in Table 1, while Fig. 1 shows the location of each station in the study area (NRN) and Fig. 2a-c shows the long-term monthly variation temperature, wind speed, and sunshine hour datasets.

Methodology
The two models used in estimating ET o for all the 19 stations of the study area are FAO-56 PM and HS (1985) equations.

PM model
In the present study, FAO-56 PM model which has been generally referred and accepted as a standard model for estimating ET o was adopted for calibrating HS ET o model. This equation has been used to estimate ET o for so many regions and climate because of its versatility (e.g., Jerin et al. 2021;Ogunrinde et al. 2020b;Heydari and Heydari 2014;Ghamarnia et al. 2012). The FAO-56 PM general model for estimating ET o on a daily basis is given in Eq. (9) (Allen et al. 1998): where ET o (mm day −1 ) stands for reference crop evapotranspiration; ∆ (kPa °C −1 ) represents the slope of the saturation vapor pressure; R n (MJ m −2 day −1 ) stands for net radiation term; G (MJ m −2 day −1 ) is the soil heat flux term; T (°C) stands for average air temperature; U 2 (m s −1 ) represents wind speed taken at the height of 2 m; e s − e a (kPa) represents the vapor pressure deficit in the air; (kPa °C −1 ) stands for the psychometric constant.
Very few of the synoptic stations were sited at the airport where an automatic weather station has the capacity of recording wind speed, temperature, relative humidity, and radiation datasets on an hourly or daily basis. However, most of the other synoptic stations have facilities such as standard rain gauges, wind anemometer, and air thermometers. The detailed procedure for estimating all the parameters in the

Hargreaves-Samani (HS) model
HS was developed considering the fact that there are regions or areas where there will not be adequate climatic/weather parameters needed to estimate the original Penman equation or other available ET o models (Hargreaves and Samani 1985). The 1985 version of the HS ET o is presented in Eq. (10).
where ET o (mm day −1 ) represents the potential evapotranspiration estimates; R a (mm day −1 ) stands for the extraterrestrial radiation which usually requires the day of the year and (10) latitude; T max andT min represent the maximum and minimum air temperatures; the default C and E values are 0.0023 and 0.5, respectively, which represent empirical constants for the original HS model mostly conducted in an arid climatic region.

Local calibration for Hargreaves-Samani model
Excel Solver can be used to obtain a prime value for an objective function in a target cell. It is a tool that also depends on a number of cells that are connected to the objective function in the target cell, and it utilizes the values in the changeable cells to adjust the value of the target cell. Zhu et al. (2019) indicate that C and E parameters in HS model is a problem that can be fixed using a nonlinear optimization means. For this research, Excel Solver has been identified as a means to optimize these parameters. For the parameterization of HS model, there are two approaches that can be used in Excel software. The approaches are the Generalized Reduced Gradient (GRG) solver (Lasdon and Smith 1992) and the evolutionary solver (Barati 2013). The GRG approach was adopted because of its universality and acceptability in similar studies (Zhu et al. 2019). More detailed explanation on the optimization algorithms of Excel software using solver can be found on Premium Solver Platform (2010). The monthly ET o values computed using HS and FAO-56 PM models were divided into two periods, which covered for calibration and validation distinctly. Twenty-seven and eight years were used for the calibration and validation phases, respectively. During the calibration phase, the FAO-56 PM method was used for calibrating parameters C and E of the HS equation using the Solver tool in Microsoft Excel 2016 predicated on the nonlinear least squares technique that is targeted to reduce the residual sum of squares of the estimated ET o from the FAO-56 PM and HS equations. After completing the calibration for all the 19 meteorological stations considered in the present study, the calibrated parameters (C and E) were used to replace the original C and E values in the original HS equations in the next 8 years of the meteorological data in order to substantiate the calibrated HS equation. The statistical assessment carried out on the HS equation was predicated on the results obtained from the validation period.

Aridity index
Aridity or climate index can be defined as a pointer for the intensity of dryness of the climate of a region/area (Gocic and Trajkovic 2013). For this work, the UNEP index (UNEP 1992) was adopted to serve as an indicator for the climate classification of all the 19 meteorological stations. The classification was based on the ratio of rainfall (R) to reference

Spatial analysis
To show the geographical spread of the calibrated C and E parameters, the root mean square error (RMSE), mean absolute error (MAE), and correlation coefficient (R) within the study area, ArcGIS 10.4 software desktop edition was used.
To illustrate the spatial maps, the inverse distance weighting (IDW) interpolation technique was utilized to convert points to spatial datasets. The IDW presumes the unmeasured points by estimating it from the surrounding nearest measured points. The equation for estimating the unmeasured points is as follows (Zinat et al. 2020): where x o is the inference point and the x i are the sampled points within a study area. The weight parameter (r) is linked to distance by d ij , which is the distance between the inference point and sampled points.
The rationalization behind using the IDW is its wide usage in comparison with other interpolation models like natural neighbors, spline, and kriging (Zinat et al. 2020).

Statistical assessment
In order to evaluate the error margins between original HS and FAO-56 PM, and calibrated HS and FAO-56 PM, some statistical tests carried out include root mean square error (RMSE), mean absolute error (MAE), and coefficient of correlation (R). All these tests were done for all the stations using Eqs. (11) to (13).
where n stands for the number of data in months; H i represents the original or calibrated HS ET o models; P i represents the ET o estimated from FAO-56 PM.   (Fig. 3a, b). The correlation coefficient (R) values within all the four climatic regions indicated from the study area ranged between 0.56 and 0.87, with an average value of 0.73. The arid region recorded the highest (0.87) while the least is noted at the sub-humid region (0.56). The reduction in the correlation coefficients from the arid to the sub-humid region may be due to the role of relative humidity. This is in line with the observation of Zhu et al. (2019). Judging by the correlation coefficient, the precision of the ET o computed from the original HS equation in the arid climatic region is relatively higher than other climatic regions. In line with the findings of some researchers (Zhu et al. 2019;Almorox and Grieser 2016), the windy climate as a result of the closeness of Sokoto meteorological station (arid region) to the Sahara Desert may be the influence responsible for the considerably higher R between the original HS model and FAO-PM model than other climatic regions (i.e., semi-arid, sub-humid, and humid). However, the original HS model with the recommended parameters (C and E) was not consistently able to give reliable results in most parts of the NRN. Therefore, the local calibration of these two parameters under different climate regions is essential to derive better precision for HS ET o model.

The estimation of C and E parameters of the HS equation
The spatial distribution pattern for both C and E parameters in the HS equation for the NRN are presented in Fig. 4a and b. The parameter C ranges between 0.0031 and 0.0096 with an average value of 0.0047. Although there is no observed significant trend within the four climatic regions, all the stations within the arid and semi-arid regions indicated a C value greater than the average value except for Bauchi meteorological station (0.0035). The average calibrated values for parameter C in arid, semiarid, sub-humid, and humid climate regions are 0.0055, 0.0050, 0.0046, and 0.0040, respectively. These observations indicate that the parameter will change under different climate regions. This is similar to some earlier studies (Zhu et al. 2019;Pandey et al. 2014). The C parameter was clearly more than the recommended value (0.0023) in all the climate regions. This shows the weakness of this parameter in the estimation of an accurate HS model. The high spatial distribution pattern illustrated by Fig. 4a shows that the correction of parameter C is essential for the optimum estimation of ET o within the study area. In support, some earlier studies indicated that there is always a large variation in this parameter in different areas, even within a climatic region (Zhu et al. 2019;Xu et al. 2013). Figure 4b reveals the spatial distribution pattern of parameter E within the NRN. The figure shows that the values of the corrected parameter E do not have a specified trend or pattern within the study area. However, the highest parameter E value is noted at Zaria (0.41) and the lowest at Gusau (0.04), both located in the sub-humid climatic region. Observations from Fig. 4a and b are similar to Zhu et al. (2019) in some provinces in China. The outline/curve line of parameter E close to 0.5 was indicated mostly in the central and southeastern regions of the NRN, which are mostly located in the humid/sub-humid regions. Regions within the arid and semi-arid climatic zones show values far from the recommended value. Although there are some agreements among most studies carried out on the calibration of HS model, there is still an indication that the original parameters C and E should not be fixed for any climatic region. Some of this disparity might have been caused by the method used in the calibration process. Therefore, there is a need to synergize the whole calibration process for a more robust application of the HS equation.

Comparison between the calibrated HS and PM models
The comparisons between the calibrated HS and FAO-56 PM models using MAE, RMSE, and R are illustrated in Fig. 5a-c. The error values indicated by MAE and RMSE in Fig. 5a and b were smaller than that of Fig. 4a and b. The smallest MAE and RMSE value occurred at Ilorin station, located in the sub-humid climatic region. The R value increased from the southern part of the study area upward north. After calibration, there is an increase in the R values recorded in the NRN at almost all the 19 stations. The sub-humid climatic region experienced the highest improvement in the R value when compared with other climatic regions. Therefore, the outcome from the statistical parameters among the climatic regions show that the precision of the calibrated HS performed better at the sub-humid region than other regions. In general, the calibrated HS model did satisfactorily in the NRN, except in the humid climatic region where there was a slight reduction in the R value (i.e., Jos). This observation is similar to the findings of Zhu et al. (2019) and Pandey et al. (2014).  It could also be due to the influence of the high variations in the relative humidity and wind speed during the Harmattan period that is more prevalent in the arid and semi-arid regions. Since these two important climate parameters (wind speed and relative humidity) are not considered in the development of HS equation, it is, however, necessary to recalibrate the HS equation using the prevalent climatic conditions of an area. The results illustrated in Fig. 7 show a better fit for all the climatic regions especially for the tropical regions. The observations from these results are similar to that of Zhu et al. (2019) which revealed that the calibrated HS ET o equation for subtropical and tropical monsoon China performed well. Figure 8 illustrates the ET o trends obtained from the HS and PM equations during the study period for four stations representing all the climatic regions in the NRN. The variation in the trend patterns of ET o values was moderately coherent during the calibration and validation phases. Due to the strong winds from the Sahara Desert that is more prominent during the dry season (i.e., usually between the months of November and February), the calibrated HS equation underestimated the ET o values during this season compared with the wet season. The disparity between the ET o values obtained from the calibrated HS and FAO-56 PM models was higher in the tropical regions than the arid and semi-arid climate regions. Relatively, Fig. 9a-d indicates a better performance for the calibrated HS than the original HS, although it was more obvious at Makurdi and Lokoja stations (i.e., sub-humid and humid regions). Figure 9a-d shows the ET o comparison obtained from the FAO-56 PM, original, and calibrated HS models for four representative stations during the study period. The coefficient of determination (R 2 ) for the calibration and validation periods of the calibrated HS equation ranged from 0.48 to 0.78 and 0.52-0.83, respectively, while that of the original HS model ranged from 0.31 to 0.58. For all the four stations representing the climatic regions of the study area, the values of the statistical parameters for both calibration and validation phases are higher than those recorded for the original HS model. Also, the R 2 values decreased for the original HS model from the arid to the humid region, while the calibrated HS model got improved in the same pattern. Apart from the intense wind speed and low humidity that usually characterize arid and semi-arid regions, other factors that may be liable for the changes in the values recorded in various climate regions include location, altitude, terrain, extreme events,  (Almorox and Grieser 2016).
Some similarities could be noted between the four climate classes and the efficiency of the HS equations (original and calibrated) considering Figs. 8 and 9. Figure 9 shows that in the arid and semi-arid climatic regions, the regression lines drawn for calibrated HS model during the calibration and validation phases are closer to the line of best fit than that of the original HS model. Higher intercepts were noted for both original and calibrated HS under tropical climatic regions than for arid and semi-arid regions. This could have been responsible for more scattering of the observation points, thereby leading to a farther move away from the line of best fit unlike in the arid and semi-arid regions. Temesgen et al. (2005) suggested that the ET o values computed from the original HS equation would be lesser under a relative high wind speed and low relative humidity climate and greater under low/moderate wind speed and high humidity conditions. Also, just like Zhu et al. (2019) and Xu et al. (2012) observed over humid regions in China, the original HS model overestimated monthly ET o than the calibrated HS model.
Taking a critical look at the statistical analysis carried out, the calibrated HS equation can be advised more for the sub-humid and humid regions than at arid and semi-arid regions, while the original HS equation can be retained for some stations such as Sokoto, Bauchi, Kano, and Potiskum. At the abovementioned stations, the calibrated HS equation did not outperform the original HS equation. Moreover, even though the efficiency of HS model increased in most of the stations in sub-humid and humid regions, some stations such as Jos, Katsina, Kaduna, and Zaria still need to be improved on. Drawing from the results of the MAE, RMSE, and R for some of the highlighted stations in the humid and sub-humid climates, it will be more efficient to use other methods to estimate ET o values for these stations. The observations noted among some stations that led to lack of improvement in the ET o values could be the prevailing role of the local climate.

Conclusions
This study used the FAO-56 PM ET o equation to modify the C and E parameters in the original HS equation and assessed the efficiency of the calibrated HS equation based on climatic data from 19 stations across four climatic zones in the NRN. The summary of the key findings are as follows: However, the performance can depend also on the temporal resolution of the data considered and some other geographical and climatological features of the area.
This study concentrated on the investigation of local distribution symmetries on a big spatial scale and elucidates the suitability of the HS equation in many climatic regions. The outcomes also make available a vital foundation for the estimation of ET o when there is paucity of data and provide more information for irrigation management. It is, however, recommended that other calibration procedures should be investigated for this study area using daily meteorological station data instead of monthly and a combination of satellite and observational data as used in this study.

Availability of data and materials
The data that support the findings of this study are available from the corresponding author (quoc_bao. pham@us.edu.pl) upon reasonable request.

Code availability
The codes that support the findings of this study are available from the corresponding author (quoc_bao.pham@us.edu.pl).

Conflict of interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.