Evaluation of water quality of Angereb reservoir: a chemometrics approach

Deterioration of the water quality of lakes and reservoirs has become a major global concern with serious impacts on both aquatic and terrestrial environments. In the current study, temperature (Temp), electrical conductivity (EC), dissolved oxygen (DO), turbidity (TU), pH, biological oxygen demand (BOD), chemical oxygen demand (COD), total alkalinity (TA), total dissolved solids (TDS), total organic carbon (TOC), nitrate (NO3−), phosphate (PO43−) and chlorophyll a (chl-a) were determined. The study covered the Angereb reservoir and its tributaries on a monthly basis from January to March 2019 at eight sampling stations, in accordance with the APHA 2017 guidelines for physicochemical analysis. The values of all the investigated parameters, except DO (at AU, AD, KU and KD), COD and TU, were below the maximum permissible limits set by WHO. Thus, the findings for DO, TU and COD indicate that remedial action should be taken to improve the quality of the water in the reservoir and its tributaries. Multivariate statistical methods (PCA and CA) were applied to detect spatial and temporal variations in the water quality parameters. The first three principal components, which together explained about 71.32% of the total variance in the dataset, were sufficient to develop the PCA score plot. PCA and CA provided similar information: both grouped the 24 samples into three significant clusters showing spatial variation, whereas only minimal temporal variation was observed, among the samples collected in January at the reservoir sites. TU and BOD had moderate positive loadings on the first principal component and were associated with each other, whereas EC and TDS had moderate negative loadings and were positively associated with each other.
This study suggests that PCA and CA are useful tools for monitoring and controlling water quality parameters at selected surface water sampling stations.


Introduction
Water resources are a key factor in the sustainable growth and development of the world economy. Recently, the demand for water has been rising owing to rapid population growth and urbanization (Adeba et al. 2015; Nyam et al. 2020; Wang et al. 2022). To meet the increasing demand for drinking and domestic water, irrigation and clean energy, many dams have been built in several regions around the world (Khoshkonesh et al. 2022; Mezger et al. 2021; Woldeab et al. 2018). In Ethiopia, the construction of water reservoirs and dams, mainly to supply drinking water, has grown rapidly in highly populated cities. Most of the reservoirs do not fulfil their intended functions because domestic sewage and industrial and farming wastes are discharged into rivers (Jiang et al. 2022).
Contamination of surface water with toxic chemicals such as heavy metals and pesticides, excess nutrients, storm water runoff and effluents has become a major issue across the world (Ahmad et al. 2021; Alshehri et al. 2021). The influx of unwanted pollutants into aquatic systems from both point and non-point sources, together with natural geological sources, causes water quality degradation (Guadie et al. 2021; Mengistu 2021; Qiu et al. 2019). Fast population growth, which generates high volumes of municipal sewage (Glibert 2017; Lai et al. 2021), untreated industrial effluents and agricultural wastes are among the major contributors to water quality degradation globally (Armstead et al. 2016; Qiu et al. 2019).
Anthropogenic factors directly or indirectly affect water quality, depending on the type and level of developmental activities within catchments (Glibert 2017; Golubkov and Golubkov 2018; Maliaka et al. 2020). Intensive anthropogenic activities close to domestic water supply areas are considered potential risks for water quality deterioration (Olokotum et al. 2020; Oliver et al. 2018). The suitability of water for use and consumption is determined by its taste, odour, colour and concentration of organic and inorganic nutrients (Dietrich and Burlingame 2020; Dayarathne et al. 2021; Tesfaye et al. 2021). The presence of inorganic contaminants above the permissible limit can degrade water quality and consequently disturb normal body physiology, with potentially lethal effects (Akinnusotu et al. 2021; Palansooriya et al. 2020; Pigłowski 2018). In developing countries like Ethiopia, where 50% of the population lacks sanitation and 80% of wastewater is discharged directly into nearby aquatic systems, there is a high risk of water quality degradation owing to the low capacity for waste treatment (Alemu et al. 2017; Desye et al. 2021; Gebremedhin et al. 2018; Tadie 2018; Tessema 2017; Tadesse et al. 2018).
Angereb reservoir is the main source of domestic potable water for Gondar city and its vicinity (Haregeweyn et al. 2012; Getachew and Melesse 2012; Zeleke et al. 2013). The reservoir is surrounded by agricultural fields where fertilizers and pesticides are applied, which causes nutrient flow into the reservoir (Ali and Shakir 2018; Gessesse et al. 2009; Zeleke et al. 2013). Besides, the Angereb watershed lies at the interface between rural and urban areas, which makes it more vulnerable to water quality deterioration as a result of land degradation, sedimentation and solid waste disposal (Haregeweyn et al. 2012). However, information on the water quality of the reservoir is scarce. Therefore, in the present study, the water quality parameters of Angereb reservoir were evaluated with chemometric approaches, namely principal component analysis/factor analysis (PCA), biplot (BP) and hierarchical cluster analysis (HCA), in order to characterize the similarities and dissimilarities of the water quality parameters among the investigated samples.

Study area
Gondar city, one of the main tourist destinations in Ethiopia, is located about 750 km north of the capital, Addis Ababa. Angereb reservoir was constructed in the Angereb watershed on the eastern side of the city, between 37° 25′ to 37° 31′ E and 12° 00′ to 12° 34′ N, at an elevation of 2133 m above sea level (Fig. 1). The watershed belongs to the Blue Nile basin and has an area of 7654 ha. The topography of the watershed is characterized by hilly areas, ridges, and valleys between hills. The area has a mean annual temperature and rainfall of 17-25 °C and 4-311 mm, respectively (Asefa et al. 2013). The main tributaries of Angereb reservoir are the Kokoch and Angereb rivers.

Sampling and sample preparation
Water samples were collected three times every month, at ten-day intervals, from 1 January to 30 March 2019 at eight sampling sites. Four sites were on the tributaries: Angereb river upstream (AU), Angereb river downstream (AD), Kokoch river upstream (KU) and Kokoch river downstream (KD). The remaining four were in the reservoir itself: the upper part of the reservoir water edge (UR), the spillway or bottom of the reservoir water edge (BR), the side part of the reservoir water edge (SR) and the middle part of the reservoir (intake, IR).
Each collected sample was homogenized in a single container to obtain one composite sample and filtered through 0.47 µm pore size glass fibre filters (Whatman GF/F) using a 300 mL vacuum hand filter. The filtrate was collected in plastic bottles that had been cleaned with 10% H2SO4 and rinsed with distilled water. The filtered water samples were kept in a cool box until they were transported to the laboratory.
All water quality parameters were analysed in triplicate according to the standard testing procedures recommended in the American Public Health Association (APHA) standard methods for water examination (APHA, 1926).
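The triplicate measurements are summarized in Table 1 as mean ± SD. A minimal sketch of that summary is given below; the replicate readings are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which form was used.

```python
import statistics

def summarize(replicates):
    """Return 'mean ± SD' for a set of replicate readings (sample SD, n - 1)."""
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates)  # assumption: sample standard deviation
    return f"{mean:.2f} ± {sd:.2f}"

# Hypothetical triplicate pH readings for one composite sample (illustration only).
ph_replicates = [7.30, 7.32, 7.34]
summary = summarize(ph_replicates)
print(summary)
```

The same helper could be applied to each parameter, site and month to build one cell of a table laid out like Table 1.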

Data analysis
All statistical analyses were conducted using SPSS Ver. 23 software. Descriptive statistics (mean and standard deviation) and inferential statistics, including correlation and mean comparison tests, were used to assess the relationships among the physicochemical characteristics of the water in the dam. Cluster analysis (CA) and principal component analysis (PCA) were used to determine the interrelationships among the physicochemical parameters that account for the variability of the reservoir water quality among sampling sites and months, and to group the sampling locations according to these parameters.

Table 1 Limnological physicochemical water quality parameters of Angereb reservoir and tributaries (Mean ± SD). Units: Temp (°C), TA (mg L−1), TU (NTU), EC (µS/cm), TDS (mg L−1), DO (mg L−1), COD (mg L−1), BOD (mg L−1), NO3− (mg L−1), PO43− (mg L−1), TOC (mg L−1), Chl-a (µg/L)

Physicochemical parameters
The spatial and temporal variations of physicochemical water quality parameters from Angereb reservoir and its main tributaries are presented in Table 1.
The temperature of the sampled water ranged from 16.70 to 26.60 °C, with the lowest value at KU in February and the highest at SR in March (Table 1). All the values were below the WHO permissible limit of 30 °C (WHO 2017). The results agree with the findings reported by Tadie (2018) for the same study area and by Akinbile and Omoniyi (2018) for Nigeria.
The lowest electrical conductivity (EC) was recorded in January at KD and the highest at AU. The high EC values at AU might be related to extensive agricultural activity around the area and improper waste discharge in the region (Ojekunle et al. 2020). All the EC values fell below the permissible standard of 1500 µS/cm set by WHO (2017).
The pH of the water samples was lowest (7.32) at the IR site and highest (8.00) at UR in January, which is consistent with the values recorded by Aragaw and Gnanachandrasamy (2021) and Rahman et al. (2021). The relatively high pH values of the tributaries and reservoir indicate the hydrolysis of HCO3− and CO32− and the production of OH− (Toufeek et al. 2009). The pH values at all the study sites were within the safe limit of 6.5-8.5 set by WHO (2017).
The dissolved oxygen (DO) values in the reservoir (UR, BR, SR and IR) were relatively lower than those in the tributaries (AU, AD, KU and KD). In February, the lowest (5.10 mg L−1) and highest (10.51 mg L−1) DO values were found at SR and AU, respectively. The DO values at AU, AD, KU and KD were slightly higher than the permissible limit of 6 mg L−1 (WHO 2017). The relatively low DO observed in the present study might be ascribed to severe anthropogenic activity in the vicinity as well as the entrance of oxygen-demanding wastes from the surrounding tributaries due to photosynthetic activities of aquatic plants (Engdaw and Subramanian 2015).
Turbidity (TU) is caused by suspended solids that influence the extinction of light in the water by reducing the visual depth (Hribalova and Pabst 2020; Wu et al. 2017). The lowest and highest turbidity values were noted in the water samples from AU (13.53 NTU) and BR (49.20 NTU) in January. The TU values at all sampling sites were above the WHO value of 5 NTU (WHO 2017). The higher turbidity values may be due to increased transport and discharge of detached soil particles, sewage and domestic wastes into the water bodies of the study area (Akinbile and Omoniyi 2018). Besides, turbidity might also be caused by algae, microorganisms, and inorganic and organic minerals (Chang et al. 2020; Griffith et al. 2020; Liu et al. 2020; Mekuria et al. 2021; Zimale et al. 2018). This observation is consistent with earlier research on the turbidity of Harsool-savangi Dam, India (Shinde et al. 2011), and of waters in Bangladesh (Rahman et al. 2021), which also reported values higher than the WHO limit (WHO 2017).
Total dissolved solids (TDS) is an indicator of water pollution and originates from sewage, natural sources, urban runoff and industrial wastewater (Kharake and Raut 2021). The highest and lowest TDS values measured in the reservoir water and tributaries were 251.20 mg L−1 at AU and 158.50 mg L−1 at KD, respectively, both in samples collected during January. The relatively higher TDS values at AU and AD may be due to excessive accumulation of sewage water and sludge from various areas of the catchment (Jaybhaye et al. 2022). The TDS values of all water samples were below the permissible limit of 600 mg L−1 given by WHO (WHO 2017). The values were comparable to those reported for some dams in Saudi Arabia (Albaggar 2021; Musa et al. 2014).
In the present study, the BOD ranged from 9.30 to 19.70 mg L−1, with the lowest value at UR in January and the highest at KD in February. This indicates a high organic load in the river at KD, where the water quality is deteriorating compared with the other sampling sites.
The BOD values are consistent with those reported by Rahman et al. (2021) in Bangladesh. The BOD values of all water samples were above the permissible limit of 5 mg L−1 given by WHO (WHO 2017), which indicates the presence of large quantities of organic material in the water (Rahman et al. 2021).
Similarly, the COD values were highest at KD (56.00 mg L−1) during March and lowest at UR (9.13 mg L−1) during January. These are in good agreement with the values reported by Khan and Wen (2021), which ranged from 8 to 51 mg L−1, and by Akinbile and Omoniyi (2018), which ranged from 18 to 49 mg L−1. COD values beyond the permissible limit set by WHO (10 mg L−1) might be due to the discharge of chemicals and organic fertilizer and of municipal effluent (Al-Badaii et al. 2013).
The total alkalinity (TA) of the water ranged from 71.00 to 128.20 mg L−1. The highest TA value was found at AU during January, whereas the lowest was at SR during February. The relatively higher alkalinity at AU in all months compared with the other sites may be evidence of high amounts of HCO3−, CO32−, OH− and similar ions in the water (Khan et al. 2020). The mean TA values in both the tributaries and the reservoir were comparable with those in the literature (Deemer et al. 2020). The values were higher than those reported from Nigeria (Adesakin et al. 2020) and for borehole water from Ghana (Boadi et al. 2020). All the values obtained at all study sites were below the WHO guideline (500 mg L−1) for drinking water (Boadi et al. 2020).
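The comparisons with WHO limits made in this section can be collected in one place. The sketch below uses only the observed ranges and upper limits quoted in the text above; the dictionary layout and the three status labels are illustrative choices, not part of the original analysis. EC and DO are omitted because their full ranges or limit direction are not stated unambiguously here.

```python
# Observed (min, max) ranges and WHO upper limits as quoted in the text.
# Units follow Table 1 (°C, NTU, mg L-1).
observed = {
    "Temp": (16.70, 26.60),
    "TU": (13.53, 49.20),
    "TDS": (158.50, 251.20),
    "BOD": (9.30, 19.70),
    "COD": (9.13, 56.00),
    "TA": (71.00, 128.20),
}
who_upper_limit = {"Temp": 30.0, "TU": 5.0, "TDS": 600.0,
                   "BOD": 5.0, "COD": 10.0, "TA": 500.0}

def classify(low, high, limit):
    """Label a range relative to an upper limit (labels are illustrative)."""
    if high <= limit:
        return "within limit"
    if low > limit:
        return "all above limit"
    return "partly above limit"

status = {name: classify(low, high, who_upper_limit[name])
          for name, (low, high) in observed.items()}

# pH has a two-sided limit (6.5-8.5); the observed range was 7.32-8.00.
ph_within = 6.5 <= 7.32 and 8.00 <= 8.5

for name, label in status.items():
    print(f"{name}: {label}")
```

On these figures, TU and BOD exceed the limit across their whole observed range, while COD exceeds it in all but the lowest readings.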

Concentrations of nutrients from Angereb reservoir
Concentrations of nutrients determined from surface water samples of Angereb reservoir and its main tributaries are presented in Table 1.
The concentration of NO3− varied from 0.05 to 0.95 mg L−1 across the studied sites. The highest concentration was found at AD in February and was 19 times the lowest value, detected at AU in January. The NO3− concentrations were relatively lower than results reported from different Ethiopian water bodies. Asefa et al. (2013) reported a value of 0.5 mg L−1 from Angereb reservoir, and Wassie and Melese (2017) reported NO3− concentrations ranging from 0.1 to 2.0 mg L−1 from the man-made Selameko Reservoir. Similarly, the levels were lower than the values reported from Greece (Chamoglou et al. 2018), Portugal (Palma et al. 2014) and Ghana (Boadi et al. 2020). The values are comparable with reports for some dams in Saudi Arabia (Albaggar 2021), Ondo State, Gilgel Gibe Reservoir, Ethiopia, concentration ranged from 0.62 to 0.69 (Woldeab et al. 2018), .32 mg L−1) of water reservoir reported from Nigeria (Adesakin et al. 2020), and Garra River, India, with NO3− values ranging from 0.03 to 3.79 mg L−1 (Khan and Wen 2021). However, the concentrations of NO3− in this study were relatively higher than those reported by Alsalme et al. (2021) from Pakistan. The NO3− concentrations obtained were lower than the WHO threshold value of 50 mg L−1 for drinking water.
The levels of PO43− in the water samples ranged from 6.37 to 26.75 mg L−1. The lowest and the highest values were obtained at AU during September and at IR during March, respectively. These values were comparable with the results reported by Tefera et al. (2021), which ranged from 4.7 to 17.2 mg L−1. All the recorded PO43− values fell below the permissible limit set by WHO (250 mg L−1). The PO43− level was higher than the values reported from Ethiopia (Wassie and Melese), Algeria (Saal et al. 2021) and Pakistan (Alsalme et al. 2021). However, the results of this study were comparable with those of a previous study of Angereb reservoir (27.8-29.0 mg L−1) (Tadge et al. 2021).
Total organic carbon (TOC) is a very important parameter for evaluating drinking water and wastewater quality and for determining the degree of pollution present in drinking water (Pandey et al. 2021). The TOC values of the dam and reservoir ranged from 4.00 to 41.00 mg L−1. As shown in Table 1, the lowest TOC concentration was found in the sample from AU and the highest in the sample from BR collected in January. These observations are consistent with earlier research by Pandey et al. (2021), in which the TOC values of drinking water ranged from 0.71 to 98.57 ppm, and by Wu et al. (2017), who reported values from 0.60 to 47.3 mg L−1 in surface water.
Besides nutrient loads, chl-a is among the indicators of the severity of eutrophication and algal blooms in a water body. In this study, the highest chl-a concentration was recorded at AD (6.91 µg/L) in March and the lowest at IR (3.89 µg/L) in January. At all study sites in all months, the chl-a concentrations were below the WHO threshold value (10 µg/L). These values are supported by the data reported by Hu et al. (2012). However, they are lower than most of the values reported by Zou et al. (2020) and Liu et al. (2012).

Multivariate statistical analysis of spatial variation of water quality parameters
Principal component analysis (PCA) identifies the most significant variables by reducing high-dimensional data with minimal loss of information. PCA was applied to extract the most significant principal components (PCs) and to reduce the contribution of the least significant variables. In this study, the multivariate dataset comprised 13 physicochemical variables and 24 water samples collected from 8 sampling locations (rivers and reservoir) in three sampling periods (January to March). The PCA was performed on standardized variables to eliminate the effect of the different magnitudes and measurement units on the factor loadings (Nosrati 2015). The first three principal components (PCs) explained 34.3, 25.7 and 11.32% of the total variance, respectively, giving a cumulative explained variance of 71.32%. It is necessary to use 2-3 main components for robust multivariate analyses, and the number of PCs retained was based on the Kaiser criterion, that is, an eigenvalue greater than 1 (Pinheiro-Sousa et al. 2021). The scree plot (Fig. 2) was used to identify the number of PCs. The loadings of the variables and the correlations between the variables and the PC scores are given in Table 2.
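The steps described above (standardization, eigenvalues of the resulting correlation matrix, percentage of variance explained, Kaiser criterion) can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the three-variable dataset is hypothetical, and the hand-written Jacobi routine is used only to keep the example free of external libraries. The last lines cross-check the reported percentages, assuming that the total variance of 13 standardized variables equals 13.

```python
import math

def standardize(columns):
    """Z-score each variable: subtract its mean and divide by its sample SD."""
    scaled = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        scaled.append([(x - mean) / sd for x in col])
    return scaled

def correlation_matrix(z_cols):
    """Covariance of z-scored variables, which equals their correlation matrix."""
    n = len(z_cols[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_cols]
            for ci in z_cols]

def jacobi_eigenvalues(matrix, max_sweeps=50, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the off-diagonal element a[p][q].
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

# Hypothetical readings for three variables across six samples (illustration only).
data = {
    "EC": [300.0, 320.0, 340.0, 360.0, 380.0, 400.0],
    "TDS": [160.0, 175.0, 190.0, 200.0, 220.0, 250.0],
    "TU": [49.0, 42.0, 35.0, 30.0, 20.0, 14.0],
}
eigenvalues = jacobi_eigenvalues(correlation_matrix(standardize(list(data.values()))))
explained = [100.0 * ev / len(data) for ev in eigenvalues]  # % of total variance per PC
retained = [ev for ev in eigenvalues if ev > 1.0]           # Kaiser criterion

# Cross-check of the percentages reported in the text (13 standardized variables).
reported_pct = [34.3, 25.7, 11.32]
cumulative_pct = round(sum(reported_pct), 2)
implied_eigenvalues = [round(13 * p / 100.0, 3) for p in reported_pct]
print(cumulative_pct, implied_eigenvalues)
```

Under that assumption, each of the three reported percentages corresponds to an eigenvalue greater than 1, which is consistent with retaining three PCs under the Kaiser criterion.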
The PCA score plot (Fig. 3) shows the distribution of the water samples from the sampling locations on the axes of two PCs. The score plot showed three different clusters. The samples collected from the Angereb River upstream and downstream locations in the three periods (January to March) were grouped as Cluster A and are negatively correlated with PC1. Similarly, the Kokoch River samples shared similar physicochemical patterns (Cluster B) and are positively correlated with the first PC. The samples from the reservoir sites show similar patterns and are grouped as Cluster C. The data projection on the space of the second PC showed a correlation between the samples collected in January at the reservoir sampling sites. However, the preliminary conclusion from these exploratory PCA was
The normalised object scores and variable loadings on each PC were scaled proportionally to the root of the variance accounted for by that PC, as shown in the plot (Fig. 4). The biplot showed that the first PC has moderate positive loadings for turbidity, PO43− and BOD, whereas it has moderate negative loadings for TDS and EC. The second PC has moderate positive loadings for COD, Chl-a, BOD and DO, and a moderate negative loading for TOC (Table 2). Moreover, TDS and EC were found to be highly correlated, and they contributed significantly to grouping the samples collected from the Angereb river upstream and downstream sampling locations, whereas PO43−, BOD and DO were highly significant for grouping the samples from the Kokoch river sampling locations. BOD, TOC, turbidity and PO43− were positively correlated with each other but negatively correlated with TDS and EC when the data are projected on the space of the first PC.
The second PC shows a strong correlation among the samples collected in January at the reservoir sampling locations (Fig. 4). Therefore, this analysis helps to reduce the number of sampling locations, and only the most influential variables need to be used to control the water quality.

Cluster analysis
Hierarchical cluster analysis (HCA) was applied to detect similarity by measuring the Euclidean distance between each pair of sampling locations in terms of the measured physicochemical parameters and then grouping the sampling locations that are close together. The Euclidean distance quantifies the similarity between two samples or groups of samples: the smaller the distance, the more similar they are.
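A minimal sketch of this grouping step is given below. The site coordinates are hypothetical standardized values chosen for illustration, and average linkage is an assumption, since the text specifies only the Euclidean distance and not the linkage rule.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two samples described by the same variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, data):
    """Mean pairwise distance between members of two clusters (assumed linkage rule)."""
    total = sum(euclidean(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(data, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[name] for name in data]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], data),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters

# Hypothetical z-scored (EC, TU) values for six sites; not taken from Table 1.
sites = {
    "AU": (1.2, -1.0), "AD": (1.0, -0.8),
    "KU": (-1.1, 0.2), "KD": (-1.3, 0.4),
    "UR": (0.1, 1.1), "IR": (0.0, 0.9),
}
groups = {frozenset(c) for c in agglomerate(sites, 3)}
for group in sorted(sorted(g) for g in groups):
    print(group)
```

Recording the distance at which each merge happens would give the heights needed to draw a dendrogram.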
The resulting groups of samples should exhibit high internal homogeneity (within a group) and high external heterogeneity (among groups), and the grouping is typically illustrated with a dendrogram (Prieto-Amparán et al. 2018). The dendrogram shows the sampling locations grouped into three significant clusters as illustrated in
PCA and CA were carried out to identify the spatial and temporal variations and the most influential physicochemical parameters for controlling the water quality. The PCA and CA revealed a significant difference in water quality among groups of sampling locations (Angereb River, Kokoch River and Angereb Reservoir) and temporal variations within the group of samples collected from January to March at the reservoir site. The physicochemical parameters with moderate loadings (turbidity, phosphate, BOD, EC and TDS) can be used as the prominent parameters to control and monitor the water quality. Although the majority of the parameters were below the WHO recommended levels, some could not be ignored because their levels exceeded the WHO recommended limits, and they require immediate follow-up and remedial action by the responsible bodies.
Funding This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.
Data availability All the generated data were included in this article.
Code availability Not applicable.

Conflict of interest
The authors declared that they have no conflict of interest.

Ethical approval Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.