Introduction

The rapid increase in population, together with climate change, has stressed global water resources. According to the United Nations World Water Development Report, water use has increased by a factor of six over the past 100 years and continues to grow by about 1% every year (UN-Water, 2020). In most developing countries, groundwater is the most common resource for meeting essential needs such as drinking, bathing, and agriculture. Beyond availability, water quality is the main concern for water supply authorities because it is directly associated with the health and well-being of consumers (Karimi et al., 2020; Organization et al., 2019). If a quality assessment shows that the biological, physical, and chemical parameters of the water lie within the permissible ranges of the World Health Organization (WHO) guidelines, the water is safe to drink and is often referred to as potable water (Majeed et al., 2018).

In developing countries, researchers have highlighted the problem of drains that carry untreated industrial effluents (Amano et al., 2020; Onwuka et al., 2004; Rashid et al., 2018). Lack of maintenance and poor management of these drains allow effluents to infiltrate the surrounding ground, which severely affects the groundwater quality (Rashid et al., 2018). Industrial waste management is therefore one of the major challenges for groundwater quality (Amano et al., 2020). One industry that can pollute groundwater is the tannery (leather) industry. Tanneries consume a significant amount of water, along with various toxic chemicals, in processing animal skins. For instance, a small tannery with a processing capacity of approximately 4 tons may need more than 100,000 l of water per day, which is roughly equal to the daily household water demand of at least 2500 people. The wastes generated by tanneries have a radiant color, high biochemical oxygen demand (BOD) and pH, and high values of total dissolved solids (TDS). Chromium, which is used extensively in the tanning process, is considered a crucial pollutant (Sajil Kumar & James, 2019).

According to some researchers, using tannery effluent for irrigation was once considered favorable because of its high load of plant nutrients (Keraita et al., 2008; Perulli et al., 2021; Sumner, 2000). However, excessive and long-term use of tannery effluent can be catastrophic, as heavy metals steadily accumulate in the groundwater and eventually exceed the permissible limits. For example, Dotaniya et al. (2017) conducted a study in Kanpur, Uttar Pradesh, India (also known as the “Leather City” for hosting the largest tanneries in India). In their study, 20 effluent samples were collected from different tanneries of Kanpur, and groundwater was collected from 12 areas irrigated with tannery effluent. Very high electrical conductivity (EC) values were recorded in the effluents, which were highly toxic because of their high mean concentrations of heavy metals. Moreover, Cr in the effluents exceeded the allowable limit set by the Uttar Pradesh Pollution Control Board (UPPCB). The mean heavy metal content of the groundwater samples, however, was within the permissible limits, except for 2 samples in which the Cr concentration was higher than the allowable limit. It was concluded that irrigating with contaminated water builds up toxic heavy metals, especially Cr, in the crops and thereby adversely affects the entire food chain, human health, soil, and groundwater quality.

Another study assessing the possible contamination of groundwater by leather industries was conducted in South India (Karunanidhi et al., 2021). In this study, 35 groundwater samples were analyzed for pH, TDS, EC, fluoride (F\(^-\)), nitrate (NO\(^{-}_3\)), and Cr. The nitrate concentration ranged from 3 to 81 mg/l with an average of 48.6 mg/l, and about 57% of the samples exceeded the WHO drinking water limit of 45 mg/l. The concentration of F\(^-\) varied from 0.1 to 2.7 mg/l with an average of 1.5 mg/l, and approximately 51% of the wells exceeded the WHO limit of 1.5 mg/l. Similarly, the concentration of Cr varied from 0.01 to 0.19 mg/l with an average of 0.1 mg/l, and approximately 66% of the samples exceeded the WHO limit of 0.05 mg/l. Spatial mapping of the study area indicated that about 271.76 km\(^2\) is at risk because of the high concentration of Cr in groundwater. The study further showed that, according to the total hazard index (THI) of the groundwater samples, 66%, 46%, and 43% of the collected samples had THI values above the allowable limits for the population groups considered, which included both children and adults (Karunanidhi et al., 2021).

The economy of Pakistan depends heavily on the leather industry, which plays a very important role in the GDP growth of the country. Roughly 500,000 people earn their living directly or indirectly from this sector (Desk, 2019). There are various small- and large-scale tanneries in Pakistan, and the major ones are established in almost all of the metropolitan cities, including Karachi, Lahore, Kasur, Sialkot, Gujranwala, Peshawar, and Multan. Although the industry is one of the largest contributors to the national economy, the waste it produces is poorly managed, causing environmental hazards and health-related issues (Hashmi et al., 2017). Because industrial effluents are released into water bodies without proper treatment, the water quality of Lahore, Karachi, Faisalabad, Peshawar, Sialkot, Gujrat, and Sheikhupura is reported to be deteriorating rapidly. Leather processing, one of the most important industrial activities, is concentrated in Sialkot, Kasur, and Karachi, where 600 tanneries are operating. Effluents discharged from tanneries are toxic and lead to various chronic waterborne diseases. Kasur, a city in central Punjab, is considered a hub of industrial activity. A large number of tanneries are situated in this region, and they are the source of approximately 9 million liters of contaminated wastewater. For this reason, Kasur is marked as one of the most critically affected cities in terms of groundwater contamination (Anwar & Anwar, 2020).

Studies show that, at present, nearly 230 tanneries are polluting the groundwater reservoirs in Kasur (Ashraf et al., 2010; Umar, 2017). The government of Pakistan and the United Nations Development Program (UNDP) signed the Kasur Tannery Pollution Control Project document in 1996 to cope with the pollution caused by tanneries. The main goals in setting up the Kasur Tanneries Waste Treatment Plant (KTWTP) were (i) to treat the tannery waste, (ii) to implement low-waste leather processing, (iii) to recover chromium, and (iv) to improve occupational health. Because the KTWTP is malfunctioning, untreated tannery effluents are discharged directly into the Pandoki drain, which ultimately releases the untreated wastewater into the Sutlej River. In many instances, the untreated wastewater is used to irrigate crops. As a result, food grains and edible crops are contaminated and unfit for human consumption. The untreated tannery effluents have also deteriorated the groundwater quality through seepage, as the Pandoki drain is unlined (Umar, 2017). The contaminated water in the Pandoki drain has led to many serious diseases among the local population, such as cancer, kidney failure, eczema, hepatitis, and gastroenteritis. Animals are also affected, as many diseases break out among them after they consume this water (Ashraf et al., 2010).

In summary, the inhabitants of Kasur are drinking shallow groundwater that has been contaminated by infiltration along the drain carrying untreated tannery effluents. The aim of the present study is to highlight the current condition of the groundwater by estimating its physicochemical properties in the vicinity of the Pandoki drain. The estimated parameters are compared with their respective permissible limits issued in the Punjab Environmental Quality Standards for Drinking Water (PEQSDW). A cross-correlation analysis is also performed to determine the relations between the various chemical properties. Moreover, since the cross-correlation analysis only yields pairwise relations between parameters, a higher-dimensional analysis of the data is performed using principal component analysis. Finally, a cluster analysis is performed to identify the most vulnerable areas.

Material and methods

Study area

The area selected for the study is Kasur, which is situated in central Punjab, Pakistan. The city is about 55 km from Lahore, the capital of Punjab province (PMDFC, 2008), and lies between \(30^{\circ }.5^\prime - 31^{\circ }.2^\prime\) N and \(74^{\circ }.10^\prime - 74^{\circ }.50^\prime\) E. Figure 1 shows the map of Pakistan along with the map of Kasur city, on which the Pandoki drain is shown and the groundwater sampling locations are indicated by black dots. The Pandoki drain is an open channel in Kasur that receives domestic and industrial discharges. The drain has become a health hazard for humans, as it badly degrades the environment through odor and nuisance (Umar, 2017). It is also known as the Pandoki seepage drain and Rohi nullah, and it ends in the Sutlej River (Brindha & Elango, 2012). The study area was defined by moving radially and longitudinally along this drain, which carries untreated industrial (tannery) effluent.

Fig. 1 Sample collection points on geospatial map

Lithology and LULC of the area

The study area is a fluvial terrace whose highest point is 209 m above mean sea level. Most of the lithology is a remnant of the Rivers Beas and Sutlej and consists of fluvial deposits such as mixtures of silt and clay, sand, and some gravel (Hayat, 2003; Jalees et al., 2021). The land-use-land-cover (LULC) of the study area shows that around 86.4% of the land is under irrigation (GOP, 2014). The remaining land is distributed as follows: 4.8% built-up area, 2.3% forest, 1.9% wet areas, and 1.2% natural vegetation in wet areas.

Sample collection

Prior to sample collection, several visits were made to government and private sector organizations to collect general information on the drinking water quality and sanitation conditions of the area. The organizations visited were (i) the Public Health Engineering Department (PHED) Kasur, (ii) the Environment Protection Department (EPD) Kasur, (iii) the Pakistan Council of Research in Water Resources (PCRWR) Lahore, and (iv) Water Regime Pvt. Ltd, Lahore. A few visits were also made to the tanneries to observe and understand their treatment procedures.

The sample collection strategy was defined by moving radially and longitudinally around the Pandoki drain. As shown in Fig. 1, the sampling points start at the Pandoki drain, which carries the waste of the Kasur tanneries, and extend towards the River Sutlej. The samples were drawn from hand pumps, motor pumps, and tube wells installed at various depths in residential areas and at other sites surrounding the drain, all of which supply potable water. Samples were selected on the basis of on-site TDS and EC measurements. A total of 54 samples were collected from 18 different locations (i.e., 3 samples from each location). All sample bottles were labelled with the details of the water sample, such as the site location, bottle volume, source, and depth of the source. Moreover, a code number was allotted to each sample bottle according to the name of the site.

List of sampling locations

Figure 1 presents the groundwater sampling sites situated along the Pandoki drain. In total, groundwater was sampled at 18 points along the drain to check its quality.

Sample preservation, transportation, and testing procedure

The bottles were washed with nitric acid before sample collection, and the samples were preserved in an ice box at 4 \(^\circ\!{\text {C}}\) immediately after collection. Sample preservation, transportation, and testing followed the protocols and procedures described in Federation and Association (2005).

Parameters measured

In total, 10 parameters were measured in the collected samples. Of these, total dissolved solids (TDS), pH, turbidity, and electrical conductivity (EC) were measured on-site. The remaining 6 parameters were measured in the laboratory: (i) total hardness (TH); (ii) total alkalinity (TA); (iii) nitrates (NO\(_3\)); (iv) sulfates (SO\(^{2-}_4\)); (v) fluoride (F); and (vi) chloride (Cl).

Statistical analysis

From each location, three samples were collected and examined. For each parameter, the average of the three observations was determined and is presented in Fig. 2 as a box plot, in which each box represents the distribution of one parameter normalized with respect to its permissible limit given by PEQSDW (Table 1). Note that the PEQSDW standards are similar to the WHO standards for the parameters studied in this paper. The vertical red line in each box represents the median; a median higher than 1 means that the parameter exceeds its permissible limit at more than half of the sites. The lower and upper edges of the box represent the 25th and 75th percentiles, respectively, the dashed line corresponds to the whisker length, and the symbol “+” denotes an extreme value in the data set. The data corresponding to Fig. 2 are provided in Table 2. Figure 2 shows that three parameters are beyond the permissible limit: total alkalinity, pH, and EC. Moreover, several parameters, such as TDS, fluoride, and total hardness, have reached their critical levels.
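The normalization behind Fig. 2 can be illustrated with a short Python sketch. It is not the authors' code: the function names and the replicate readings are hypothetical placeholders rather than measured data, and only the TDS limit of 1000 mg/l is taken from the text. The quartile convention (`method="inclusive"`) is likewise an assumption, since the paper does not state which one was used.

```python
from statistics import mean, quantiles

def site_average(replicates):
    """Average the three replicate readings taken at one location."""
    return mean(replicates)

def normalize(values, limit):
    """Divide each site average by the permissible limit (1.0 = exactly at the limit)."""
    return [v / limit for v in values]

def box_summary(normalized):
    """Quartiles drawn in a box plot, plus the share of sites above the limit."""
    q1, q2, q3 = quantiles(normalized, n=4, method="inclusive")
    exceed = sum(1 for v in normalized if v > 1.0) / len(normalized)
    return {"q1": q1, "median": q2, "q3": q3, "fraction_above_limit": exceed}

# Hypothetical TDS replicates (mg/l) for four locations; NOT data from this study.
tds_replicates = [
    [400, 500, 600],
    [900, 1000, 1100],
    [1400, 1500, 1600],
    [1900, 2000, 2100],
]
TDS_LIMIT = 1000.0  # PEQSDW limit for TDS quoted in the text (mg/l)

tds_avg = [site_average(r) for r in tds_replicates]
summary = box_summary(normalize(tds_avg, TDS_LIMIT))
print(summary)
```

A normalized median above 1 flags a parameter that exceeds its limit at more than half of the sites, which is how the boxes in Fig. 2 are read.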

Fig. 2 Box plot analysis of the estimated parameters

Table 1 Summary of the tested parameters with their permissible limits

Figure 3 shows the spatial distribution of the estimated parameters. Six out of 18 sampling locations had TDS values higher than the PEQSDW recommended limit (\(<1000\) mg/l). This means that about 33% of the sampled sources supply brackish water. The excessive use of salts by tanning industries is the main reason for the higher TDS values (Zhang et al., 2019). A direct relation is observed between electrical conductivity (EC) and TDS, i.e., higher TDS values indicate higher EC values and vice versa, as shown in Fig. 3i, j.

Furthermore, the analysis reveals that 50% of the samples have EC values greater than the limit prescribed by the standards. As far as human health is concerned, there is no direct impact associated with TDS, EC, total alkalinity, and pH (Mann et al., 2007; Organization, 2011). The turbidity of all collected samples is less than 2 NTU, whereas the PEQSDW limit for drinking water is \(< 5\) NTU, so there is no health risk in terms of turbidity. The total hardness results indicate that more than 75% of the samples fall in the category of very hard water, more than 10% in the hard category, and another 10% in the moderate category. However, studies indicate that total hardness below 500 mg/l has no significant impact on human health (Sengupta, 2013). Out of the 18 sampling points, only Mangal Mandi has sulfate values higher than the PEQSDW recommended range (\(<250\) mg/l). People in the proximity of the Mangal Mandi site, especially infants, are therefore likely to be vulnerable to the health effects of high sulfates, such as gastroenteritis (Bashir et al., 2012).

The desirable concentration of chlorides is less than 10 mg/l. When the concentration exceeds the threshold of 250 mg/l, the water becomes salty and corrosive to some metals (Byron et al., 2009). The results show that more than 17% of the sampling locations have chloride values in excess of the PEQSDW recommended range (\(<250\) mg/l). The main effect of excessive chlorides is a change in the taste of the water. With long-term exposure, people can become accustomed to high chloride levels, i.e., above 250 mg/l (Taricska et al., 2006).

The recommended range of fluoride is between 0.5 and 1.5 mg/l. The results show that around 40% of the samples have fluoride values below the minimum recommended limit. Sites where the fluoride concentration is below the prescribed limit need fluoridation before the water is delivered to consumers (Rashid et al., 2012).

Similarly, nitrate is a priority health indicator according to PEQSDW and needs regular monitoring. In the collected samples, the nitrate values are within the PEQSDW recommended range, i.e., \(< 50\) mg/l (Manassaram et al., 2010). However, Dollay Wala and Mangal Mandi have values greater than 10 mg/l, and therefore, the groundwater at these two sites is not suitable for infants younger than 6 months and pregnant women. Furthermore, 50% of the sampling locations have nitrate values greater than 1 mg/l, which indicates the intrusion of contamination from agricultural activities (Manivannan & Elango, 2017).

Arsenic and chromium were not detected in any sample during the laboratory analysis. The absence of these heavy metals is attributed to the soil of this area, which resists the penetration of heavy metals into the groundwater (Pulido et al., 2012).

Fig. 3 Spatial distribution of various parameters: a total alkalinity, b sulfates, c total hardness, d nitrate, e fluoride, f chloride, g pH, h turbidity, i total dissolved solids, and j electrical conductivity

Figure 2 also shows that the standard deviations of total alkalinity, sulfates, total hardness, and chloride are 121.71, 94.68, 77.73, and 197.25, respectively, whereas TDS and EC have much higher standard deviations of 629.64 and 1302.83. Meanwhile, nitrate, fluoride, pH, and turbidity show lower standard deviations than the other parameters, i.e., 5.41, 0.13, 0.40, and 0.27, respectively.

Cross-correlation analysis

The cross-correlation analysis is performed to further elucidate the relation between various estimated parameters. The objective of this analysis is to determine the most dominant parameter(s) which are the potential key-drivers in the present sample space. Figure 4 shows the heat-map of the Pearson correlation coefficient (PCC) between various parameters. The PCC is determined using the following relation

$$\begin{aligned} R_{xy}=\frac{\sum _{i=1}^{I}{(x_i-\bar{x})(y_i-\bar{y})}}{\sqrt{\sum _{i=1}^{I}(x_i-\bar{x})^2\sum _{i=1}^{I}(y_i-\bar{y})^2}} \end{aligned}$$
(1)

where \(R_{xy}\) denotes the measure of similarity, x and y represent any two parameters, \(\bar{x}\) and \(\bar{y}\) are the mean values of the parameters (i.e., averaged over 18 samples), and \(i=(1, \cdots ,I)\) counts the sample locations with \(I = 18\). The PCC ranges between −1 and 1, corresponding to perfect negative and perfect positive correlation, respectively. A negative correlation means that one parameter tends to decrease linearly as the other increases, while a positive correlation means that there is a positive linear relation between the variables. For the case where \(x = y\), the correlation becomes the auto-correlation and reaches its maximum of 1 (as can be seen in the diagonal elements of Fig. 4).
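As a minimal sketch, the Pearson coefficient can be computed directly from its standard definition, in which the denominator is the square root of the product of the two sums of squared deviations. The function name `pearson` and the readings below are hypothetical and serve only as an illustration; they are not values from this study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the sums of squared deviations.
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    return cov / sqrt(ss_x * ss_y)  # raises ZeroDivisionError for a constant series

# Hypothetical site averages; NOT data from this study.
tds = [500.0, 1000.0, 1500.0, 2000.0]
ec = [1000.0, 2000.0, 3000.0, 4000.0]   # exactly proportional to tds
r_pos = pearson(tds, ec)                # perfectly correlated
r_neg = pearson(tds, [-v for v in ec])  # perfectly anti-correlated
r_self = pearson(tds, tds)              # diagonal element of the matrix
print(r_pos, r_neg, r_self)
```

Applying the function to every pair of the 10 parameters would fill a symmetric matrix of the kind shown as a heat-map in Fig. 4.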

Fig. 4 Cross-correlation matrix

In what follows, emphasis is placed on the two parameters that exceed the permissible limits, i.e., total alkalinity and pH. The correlation analysis shows a weak negative correlation between these two parameters, with \(R = -0.1612\). Higher alkalinity is generally considered a good indicator because it measures the buffering capacity of the water (its resistance to a change in pH). However, the negative correlation suggests that large variations in alkalinity can destabilize the pH of the water. Furthermore, an excessive level of alkalinity is a potential cause of the skin infections and gastrointestinal problems that are commonly found among the inhabitants. Another concerning finding is the correlation between total alkalinity and sulfate, with \(R = 0.6026\). Large values of these two variables are known to be dangerous for aquatic plants and to degrade vegetation if the water is used for agricultural purposes (Qian et al., 2016). The highest correlation is found between EC and TDS, with \(R = 0.9976\). This near-perfect correlation is expected, since EC is governed by the dissolved ionic content of the water; a large amount of TDS is also reported to support the life of coliforms in lake water (Ma & Zhang, 2019).

Principal component analysis and cluster analysis

The foregoing analysis is based on pairwise correlation, which only describes the general association between any two variables. However, it is of great importance to explore how the statistical properties of each parameter are linked with the others. For this purpose, principal component analysis (PCA) is performed. In PCA, the principal components (PCs) are determined using the singular-value decomposition (SVD) of the data set. To perform the SVD, the covariance matrix of the data set is first determined. The covariance matrix is defined as

$$\begin{aligned} {\textbf{C}}=\frac{1}{n-1}{\textbf{B}}^\top {\textbf{B}} \end{aligned}$$
(2)

where \({\textbf{C}}\) is the \(m \times m\) covariance matrix, \({\textbf{B}}\) is the \(n \times m\) data matrix, \(m = 10\) is the total number of estimated parameters, \(n = 18\) is the total number of sample collection points, and the superscript \(\top\) denotes the transpose of the matrix. The SVD is performed using the MATLAB “svd” command on \({\textbf{C}}\). The eigenvalues are plotted in Fig. 5, which shows that most of the useful information is contained in the first six eigenvectors of the data. Figure 6 shows the 3D sub-plots of the 10 principal components (PCs), wherein each subplot contains 3 PCs. The coordinates for each PC are determined by projecting the data onto the associated eigenvector of the covariance matrix. For instance, PC1 is the first principal component and corresponds to the highest eigenvalue in Fig. 5.
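The construction of the covariance matrix in Eq. 2 and the extraction of the first principal component can be sketched in dependency-free Python. Two points are assumptions rather than statements of the paper: the columns of the data matrix are mean-centered before Eq. 2 is applied (which a covariance matrix requires), and power iteration is swapped in for the MATLAB “svd” command, so only the leading eigenpair is obtained. All names and the 4 × 2 data set are hypothetical.

```python
from math import sqrt

def center_columns(data):
    """Subtract each column mean so that Eq. 2 yields a covariance matrix."""
    n, m = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in data]

def covariance(data):
    """C = B^T B / (n - 1) for the mean-centered n x m matrix B."""
    b = center_columns(data)
    n, m = len(b), len(b[0])
    return [[sum(b[k][i] * b[k][j] for k in range(n)) / (n - 1)
             for j in range(m)] for i in range(m)]

def leading_component(c, iterations=200):
    """Power iteration: largest eigenvalue of C and its unit eigenvector.

    Assumes C is non-zero and the start vector is not orthogonal to the
    leading eigenvector; a full SVD routine has no such restriction.
    """
    m = len(c)
    v = [1.0 / sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(c[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(c[i][j] * v[j] for j in range(m)) for i in range(m))
    return eigenvalue, v

# Hypothetical data: 4 sites x 2 parameters, second column = 2 x first column.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
C = covariance(data)
lam1, pc1 = leading_component(C)
explained = lam1 / sum(C[i][i] for i in range(len(C)))  # share of total variance
print(lam1, pc1, explained)
```

Because the two hypothetical columns are perfectly correlated, the first component carries all of the variance. With parameters measured on very different scales (for example, EC versus fluoride), standardizing each column first is a common alternative, which corresponds to working with the correlation matrix instead.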

Fig. 5 Plot for eigenvalues in the data-set

Figure 6 shows that the first 5 PCs are the most critical ones and that the remaining ones are insignificant. From a statistical viewpoint, PC1 has the largest variability and therefore contains the most useful information, PC2 is the second most important component, and so on until the eigenvalue plot flattens (i.e., after 6 PCs in our study). For instance, the estimated variables are widely spread in Fig. 6a, while the difference between them is insignificant in Fig. 6h. It can also be seen that the data distribution collapses almost to a point as soon as higher-order PCs are plotted, starting from PC7 (i.e., Fig. 6e). For the purposes of this paper, Fig. 6a–d are the most important ones because they exhibit large variability and carry uncorrelated information that best represents the data set. The physical implication of the PCA is that variables such as total alkalinity, sulfates, and EC have the largest variability among the estimated parameters. The other parameters, i.e., total hardness, nitrates, fluorides, chlorides, pH, turbidity, and TDS, do not vary much among themselves.
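The choice of how many PCs to retain can be made explicit with a cumulative explained-variance rule, sketched below. The 98% target follows the figure quoted in the caption of Fig. 6, while the function name and the eigenvalue spectrum are hypothetical and are not the eigenvalues plotted in Fig. 5.

```python
def components_needed(eigenvalues, target=0.98):
    """Smallest number of leading PCs whose cumulative variance share reaches target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)

# Hypothetical eigenvalue spectrum; NOT the spectrum of this study.
spectrum = [50.0, 25.0, 15.0, 9.0, 0.5, 0.5]
k = components_needed(spectrum)
print(k)
```

For this hypothetical spectrum the first three components cover 90% of the variance and the first four cover 99%, so four components are retained.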

Fig. 6 Comparison between various principal components (PCs) of the correlation matrix of data set. Each subplot shows the projection of collected data on the two PCs. Subplot (a) to (d) show that the first 5 PCs cover the 98% variance of the data while subplot (e) to (h) are almost identical and carry insignificant information between the various parameters of the data set

Based on the foregoing analysis, some of the sampling sites are more prone to poor groundwater quality than others. To identify these vulnerable areas, a cluster analysis of the data is performed using a dendrogram. In data mining, a dendrogram is used to show how similar certain variables are to each other on the basis of their properties. To apply the dendrogram-based cluster analysis, the sample collection locations are used as the variables, and the chemical parameters are considered as their properties. Figure 7 shows the dendrogram of the data, where the x-axis shows the sample collection ID according to Fig. 1 (or Table 2). It can be seen that site 3 (i.e., Mangal Mandi) joins all other sites at the highest node, indicating that its distribution of parameters is distinct from that of the other locations. This finding is consistent with the results of the box-plot analysis in Fig. 2. Following site 3, site 10 (i.e., Shrine Hazrat Baba Shah Kamal Chishti) is the second most vulnerable site in terms of poor groundwater quality. That these two locations are the worst is plausible, because they are the sample collection locations closest to the Kasur Tanneries Waste Treatment Plant (KTWTP) and the Pandoki drain (see Fig. 1).
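The merge sequence that a dendrogram displays can be reproduced with a small agglomerative clustering routine, sketched below. The paper does not state the distance metric or the linkage rule, so Euclidean distance and average linkage are assumptions here; the site IDs and feature vectors are hypothetical and are not the measurements of this study.

```python
from math import dist

def agglomerate(points):
    """Average-linkage agglomerative clustering.

    points maps a site ID to its feature vector. Returns the merge history as
    (height, cluster_a, cluster_b) tuples, ordered from first to last merge.
    """
    clusters = [(site,) for site in points]
    history = []

    def linkage(a, b):
        # Mean Euclidean distance over all cross-cluster pairs of sites.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [(linkage(a, b), a, b)
                 for idx, a in enumerate(clusters) for b in clusters[idx + 1:]]
        height, a, b = min(pairs, key=lambda p: p[0])  # closest pair merges next
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        history.append((height, a, b))
    return history

# Hypothetical normalized feature vectors for four sites; NOT data from this study.
sites = {1: (1.0, 1.0), 2: (1.1, 1.0), 3: (9.0, 8.0), 4: (1.0, 1.3)}
merges = agglomerate(sites)
top_height, left, right = merges[-1]
# A single site that joins only at the final (highest) merge is the outlier.
outlier = left if len(left) == 1 else right
print(merges, outlier)
```

In this hypothetical example site 3 joins the remaining sites only at the last and highest merge, which is the pattern described above for Mangal Mandi in Fig. 7. When the parameters have very different magnitudes, normalizing them (for instance by their permissible limits) before computing distances keeps a single large-valued parameter from dominating the result.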

Fig. 7 Cluster analysis of the data-set using dendrogram

Discussion

The aforementioned results and data analysis indicate that the quality of groundwater in Kasur city is at risk because of the poor management of tannery effluent. To better understand the situation at the managerial level, the authors made several visits to PHED and found that the Kasur Tanneries Waste Treatment Plant (KTWTP) has been out of commission since 2012. As a result, tannery waste is disposed of directly into the Pandoki drain without any treatment. The Pandoki drain, which carries the water to the River Sutlej, is polluting the surface water and groundwater in the city. A similar visit was made to PCRWR to collect information about the drinking water quality assessment of the study area. It was found that PCRWR has not formally conducted any research on this matter but has noticed that water samples contain excessive amounts of arsenic and chromium, which is alarming.

The on-site visits show that stagnant ponds are present in the vicinity of the study area, resulting in a poor source of drinking water. Moreover, the sanitation and sewerage systems were obsolete and out of order, which has resulted in serious health issues (Afzal et al., 2014; Andleeb et al., 2019; Sarwar et al., 2018). Previous studies also reveal that the inhabitants of these areas suffer from various waterborne diseases such as typhoid, cholera, hepatitis, giardiasis, dysentery, and intestinal disorders (Aziz, 2005; Afzal et al., 2014). Furthermore, it was observed that the effluent discharged from tanneries is also used to irrigate crops, which contaminates the edible crops and makes them unfit for consumption. The contaminated water has thus become part of the inhabitants’ routine meals, and locally cultivated food and vegetables are now a root cause of various diseases in these areas. Because the KTWTP and the Chrome Recovery Plant are not functional, people are consuming contaminated food and crops, which is a reason for the outbreak of various diseases. Overall, it is concluded that the water and wastewater authorities need to pay serious attention to further investigating and monitoring the groundwater quality of Kasur before the country is struck by another pandemic after COVID-19.

Conclusion

Tannery effluent is one of the major sources of groundwater pollution. In this case study, the second largest tannery area of Pakistan, located in Kasur city, is studied. Water samples were collected from various sources such as hand pumps, motor pumps, and tube wells. These samples were then analyzed on-site and in the laboratory to determine key physicochemical properties: total dissolved solids, pH, turbidity, electrical conductivity, total hardness, total alkalinity, nitrates, sulfates, fluoride, and chloride. All of these parameters were compared against the Punjab Environmental Quality Standards for Drinking Water, which are derived from the WHO guidelines. The estimated parameters were further analyzed using advanced data processing techniques, namely principal component analysis and cluster analysis, to identify the potentially hazardous places in the study area. The results show that many of the sampling sites are on the verge of losing their water quality. Overall, based on the analysis of this paper, the following recommendations are made: (i) legislation should be made and enforced for the controlled discharge of tannery effluents, (ii) the water treatment plant in Kasur should be restored immediately, (iii) the Pandoki drain should be lined and continuously monitored, and (iv) a detailed study with more sampling points should be conducted to provide a comprehensive analysis of water quality in Kasur.