Introduction

Globally, groundwater is a fundamental natural resource for the provision of drinking water and plays a critical role in sustaining human life: it is estimated that approximately 30% of the world’s freshwater is stored as groundwater, with about 97% of all freshwater being potentially available for human use (Morris et al. 2003). In Africa, including Ghana, the majority of people who depend on groundwater for domestic purposes live in rural and peri-urban communities, where poverty predominates. Lack of access to good-quality groundwater in these communities therefore not only infringes on basic human rights but also impacts negatively on sustainable human life. Over the last few decades, African governments have continually taken measures to provide their people with quality water from groundwater sources, because groundwater is not only feasible but also the most economical source of potable water for scattered and remote communities (Duah 2007). It is thus paramount to manage and develop groundwater within rural communities in a sustainable manner.

During the past decades, interest in the geochemistry of groundwater has increased as demonstrated by several hydrogeochemical studies which are increasingly becoming a firm part of regional hydrogeological studies. Earlier studies on the categorization of groundwater facies and chemical evolutionary history employed graphical presentations of major-ion composition of groundwater (Piper 1944; Stiff 1951; Schoeller 1965; Hem 1989). These schemes were useful in visually describing differences in major-ion chemistry of groundwater and categorizing water compositions into identifiable groups which are usually of similar genetic history (Freeze and Cherry 1979).

Recently, multivariate statistical techniques have been applied with remarkable success as tools in the study of groundwater chemistry. The application of multivariate statistical methods to geo-environmental data sets has facilitated the unveiling of hidden structures in the data sets and assisted in resolving key geo-environmental problems at various scales (Sandaw et al. 2012). Multivariate analysis of geochemical data operates on the concept that each aquifer zone has its own unique groundwater quality signature, based upon the chemical makeup of the sediments that comprise it (Fetter 1994; Suk and Lee 1999; Woocay and Walton 2008). Statistical analysis thus helps in the interpretation of complex data matrices, to better understand water quality and to identify the possible factors that influence the water chemistry in a region.

Earlier examples of classical applications of multivariate statistical methods in the earth sciences are contained in Guler et al. (2002), Cloutier et al. (2008), Jiang et al. (2009) and Kim et al. (2009). Other examples include the delineation of zones of natural recharge to groundwater in the Floridan aquifer (Lawrence and Upchurch 1982), the delineation of areas prone to salinity hazard in the Chitravati watershed of India (Briz-Kishore and Murali 1992), the characterization of groundwater contamination using factor analysis (Subbarao et al. 1995), the analysis of marine water quality and source identification (Zhou et al. 2007), and the resolution of simple geo-environmental problems in the determination of groundwater flow directions (Farnham et al. 2003). The advantage of these methods over the traditional Piper and Stiff schemes in groundwater chemistry stems from their ability to reveal hidden inter-variable relationships and to accommodate a virtually limitless number of variables; thus, trace elements and physical parameters can be part of the classification parameters. Because raw data are used as variable inputs, errors arising from closed number systems (mutual relation between variables with similar characteristics) are avoided. In addition, because elements are treated as independent variables, the masking effect of chemically similar elements that are often grouped together is avoided (Dalton and Upchurch 1978).

Owing to the discharge of mine effluents into rivers and streams through mining activities, several of the surface water resources which hitherto served as potable sources for the communities within the Lower Pra Basin have become polluted and unsuitable for drinking and domestic water supply. Communities within the basin thus rely heavily on groundwater for drinking and domestic purposes. However, the gold-bearing sulphide rocks that are prevalent in the area often contain pyrite and arsenopyrite. Exposure of these rocks to the atmosphere often results in the generation of acid mine drainage (low pH waters) and the subsequent mobilization of trace metals in high proportions into the groundwater system. The indiscriminate use of Hg and other chemicals in “small-scale” mining activities within the basin also leads to the pollution of surface water and groundwater resources. The result is a change in the characteristics of both the surface water and groundwater resources that serve as sources of potable water supply to the communities that rely on them. Earlier studies within the basin include the Catchment-Based Monitoring Project in Ghana-National IWRM Plan (2010), Ahialey et al. (2010), Bayitse (2011) and Tay et al. (2014, 2015). These studies did not apply multivariate statistical methods to unveil the hidden structures and subsequently delineate the factors responsible for groundwater pollution within the Lower Pra Basin.

It is against this background that this paper applies multivariate statistical analysis as a tool for a comprehensive hydrogeochemical assessment of groundwater, in order to unveil hidden structures in the data sets and to delineate the factors responsible for groundwater pollution, for the proper development and management of groundwater within the basin.

Materials and methods

Description of the study area

The Lower Pra Basin lies between 05°0′0″ and 06°0′0″N and 01°0′0″ and 02°0′0″W (Fig. 1). The climate falls under the wet semi-equatorial climatic zone of Ghana (Dickson and Benneh 1980). The basin comes strongly under the influence of the moist south-west monsoons during the rainy season. It is quite humid (relative humidity 60–95%) with annual rainfall in the range of 1500–2000 mm (Dickson and Benneh 1980). The average minimum and maximum temperatures are 21 and 32 °C, respectively (Dickson and Benneh 1980). The Pra Basin is part of the south-western basin system in Ghana and has a drainage area of 23,188 km2 and an estimated mean annual discharge of 214 m3 s−1 (Dickson and Benneh 1980). The basin lies entirely within the Forest Ecological Zone in Ghana (Dickson and Benneh 1980). It has moist semi-deciduous forest with valuable timber species (Dickson and Benneh 1980). Owing to the expansion of the cocoa industry, the original forest has changed to a secondary forest consisting of climbers, shrubs and soft woody plants (Dickson and Benneh 1980). Many trees in the upper and middle layers exhibit deciduous characteristics (Dickson and Benneh 1980). The basin is principally dominated by the forest ochrosols and, to a lesser extent, the forest ochrosol–oxysol intergrade. The ochrosols are highly coloured soils with little leaching characteristics (Dickson and Benneh 1980).

Fig. 1 Map of the study area (Ghana map inset) showing sampling communities within the various geological settings of the Lower Pra Basin

Land use

The land use pattern within the basin is primarily farming (cocoa and food crops) and gold mining. Large acreages of virgin forest have been removed and replaced with cocoa farms. In addition, food crops such as cassava, yam, cocoyam and plantain, as well as fruits such as banana and oranges, are produced together with cocoa for subsistence. Gold mining within the basin is of two types: “large-scale” and “small-scale” (“Galamsey”). “Large-scale” mining is conducted by the heap leach technique or by roasting of ore. Oxidised ores derived from sulphides, principally arsenopyrite, realgar, orpiment and pyrite, in the weathered zones are heap leached by cyanidation (Kortatsi 2007). Paleoplacer (free milling ore) is mined from deep zones, crushed, milled and cyanided (Kortatsi 2007). “Small-scale” mining involves extracting gold from ochrosol soils, mainly from stream floors, by mercury amalgamation (Kortatsi 2007).

Geology

The Basin is characterized primarily by Cape Coast granitoid complex and a small percentage of Dixcove granitoid complex. Some portion of the basin is also characterized by Birimian and Tarkwaian Systems.

Cape Coast granitoid

Most of the Cape Coast granitoid complex is predominantly granitic to quartz dioritic gneiss, which in the field is seen to change gradually from fine- to medium-grained, foliated biotite quartz diorite gneiss to exclusively hornblende-quartz-diorite gneiss (Ahmed et al. 1977). The gneissic rocks are interrupted by acidic and basic igneous rocks, such as white and pink pegmatites, aplites, granodiorites and dykes (Ahmed et al. 1977). Typically, the granitoids are associated with many enclaves of schists and gneisses. The Cape Coast granitic units are sometimes well foliated and are often magmatic potash-rich granitoids in the form of muscovite, biotite, granite and granodiorite, granodiorites biotite gneiss, aplites and pegmatites (Ahmed et al. 1977). They are usually associated with Birimian metasediments, and their inner structure is always concordant with that of their host rocks (Ahmed et al. 1977). The Cape Coast granitoid complex is believed to represent a multiphase intrusion comprising four separate magmatic pulses. The last phase of the magmatic pulses is believed to be associated with the upper group of Birimian metasediments (Ahmed et al. 1977). The general mineralogical composition of the Cape Coast granitoid complex includes quartz, muscovite, biotite, microcline, albite, almandine, beryl, spessartite, tourmaline, columbite/tantalite and kaolin (Kesse 1985).

Dixcove granitoid

This complex consists of hornblende granite or granodiorite grading locally into quartz diorite and hornblende diorite, sometimes believed to have been formed from gabbros by magmatic differentiation (Ahmed et al. 1977). The complex forms non-foliated, discordant to semi-discordant bodies in the enclosing country rocks, which are generally Upper Birimian metavolcanics, numerous enclaves of which are found within the granite complex (Ahmed et al. 1977). The Dixcove granitoid complex is intruded along deep-seated faults in three distinct phases which follow one another from basic to acid: gabbrodiorite–granodiorite (Ahmed et al. 1977).

Birimian supergroup

The Birimian supergroup comprises the Lower and Upper Birimian and is separated from the Tarkwaian system by a major unconformity (Kesse 1985). The Lower Birimian is principally pelitic in origin having muds and silts with beds of coarser sediments (Kesse 1985). The Upper Birimian is predominantly of volcanic and pyroclastic origin (Kesse 1985). The rocks consist of bedded group of green lavas (greenstones), tuffs, and sediments with minor bands of phyllite that comprise a zone of manganiferous phyllites containing manganese ore (Kesse 1985). The sequence is intruded by batholithic masses of granite and gneiss (Kesse 1985). These principally argillaceous sediments were metamorphosed to schist, slate and phyllite, with some interbedded greywacke (Kesse 1985).

The Tarkwaian

The Tarkwaian consists of an overall fining-upwards, thick clastic sequence of argillaceous and arenaceous sediments (mainly arenaceous) with two well-defined zones of pebbly beds and conglomerate in the lower members of the system (Junner et al. 1942). The Tarkwaian rocks comprise slightly metamorphosed, shallow-water sedimentary strata, predominantly sandstone, quartzite, shale and conglomerate, resting unconformably on and derived from rocks of the Birimian supergroup (Junner et al. 1942).

Aquifer characteristics

Boreholes within the basin are generally shallow, with depths ranging from 22 to 96 m and a mean of 44.42 m. Borehole yield is generally low and highly variable, ranging from 0.4 to 51.7 m3 h−1 with a mean of 4.55 m3 h−1; schist and granite aquifers have relatively higher yields. The fractures in the rocks are generally open. The granite and schist rocks are exposed, while the Birimian and Tarkwaian rocks have thick overburdens. The soils develop over the same kind of highly weathered parent material, with a lateritic to clayey top soil layer whose thickness generally ranges from 4 to 14 m, although the soil layer may be thicker in some areas. The static water levels of the boreholes generally range from 0.4 to 22.4 m, with a mean of 6.37 m. Static water levels in most boreholes are above the top of the aquifer, suggesting that the aquifers are either confined or semi-confined. The gneiss and granite associated with the Birimian rocks are of significant importance to the water economy of Ghana, since they underlie extensive and often well-populated areas (Dappah and Gyau-Boakye 2000). They are not inherently permeable, but secondary permeability and porosity have developed as a result of fracturing and weathering (Dappah and Gyau-Boakye 2000). Where precipitation is high and weathering processes penetrate deeply along fracture systems, the granite and gneiss commonly have been eroded down to low-lying areas (Dappah and Gyau-Boakye 2000). On the other hand, where precipitation is relatively low, the granite occurs in massive, poorly jointed inselbergs that rise above the surrounding lowlands (Dappah and Gyau-Boakye 2000). In some areas, weathered granite or gneiss forms a permeable groundwater reservoir (Dappah and Gyau-Boakye 2000). Major fault zones are also favourable locations for groundwater storage (Dappah and Gyau-Boakye 2000). The Birimian phyllite, schist, slate, greywacke, tuff and lava are generally strongly foliated and fractured. Where they crop out or are near the surface, considerable water may percolate through them (Dappah and Gyau-Boakye 2000).

Sampling and laboratory analysis

Fifty-four (54) groundwater samples were collected from boreholes in January 2012 for quality assessment. Sampling protocols described by Claasen (1982) and Barcelona et al. (1985) were strictly observed during sample collection. Samples were collected using 4-L acid-washed polypropylene containers. The samples were collected into 1 L polyethylene bottles without preservation. Samples for trace metal analyses were acidified to pH <2 after filtration (Appelo and Postma 1999). All samples were stored on ice in an ice-chest. Samples for physico-chemical analyses were transported to the CSIR-Water Research Institute laboratories in Accra, stored in a refrigerator at a temperature of <4 °C and analyzed within 1 week. Temperature, pH and electrical conductivity were measured on site using a Hach sensION 156 meter. Chemical analyses of the samples were carried out using certified and internationally accepted procedures outlined in the Standard Methods for the Examination of Water and Wastewater (APHA 1998): sodium (Na) was analysed by the flame photometric method; calcium (Ca) by EDTA titration; TDS by the gravimetric method; magnesium (Mg) by calculation after EDTA titration of calcium and total hardness; chloride (Cl) by argentometric titration; and nitrate-nitrogen by hydrazine reduction and spectrophotometric determination at 520 nm. Analyses of trace elements, excluding arsenic and mercury, were carried out using a Unicam 969 atomic absorption spectrophotometer (AAS); arsenic (As) was determined using an ARL 341 hydride generator, while mercury (Hg) was determined using the cold vapour method at the Metals Section of the Environmental Chemistry and Sanitation Engineering Division laboratories of the Council for Scientific and Industrial Research-Water Research Institute (CSIR-WRI) in Accra. An ionic balance error was computed for each sample and used as a basis for checking the analytical results. In accordance with international standards, results with ionic balance errors greater than 5% were rejected (Appelo and Postma 1999). Charge balances were calculated using Eq. (1):

$${\text{CB}} = \left[\left(\sum zM_{\text{c}} - \sum zM_{\text{a}} \right)/\left(\sum zM_{\text{c}} + \sum zM_{\text{a}} \right)\right] \times 100$$
(1)

where z is the ionic charge, M is the molality, and the subscripts a and c refer to anions and cations, respectively.
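As an illustration of Eq. (1), the following minimal Python sketch computes the charge balance for a single sample and applies the 5% acceptance limit. The ion concentrations are hypothetical values chosen for the example (they are not taken from Table 1), the function names are illustrative, and the same concentration unit is simply assumed for all ions.

```python
# Minimal sketch of Eq. (1). All numbers below are hypothetical.
# Each ion is given as (charge magnitude z, concentration M); the unit of M
# only has to be the same for every ion, because CB is a ratio.

def charge_balance(cations, anions):
    """Return CB in per cent: 100 * (sum zMc - sum zMa) / (sum zMc + sum zMa)."""
    sum_c = sum(z * m for z, m in cations)
    sum_a = sum(z * m for z, m in anions)
    return 100.0 * (sum_c - sum_a) / (sum_c + sum_a)


def is_acceptable(cb_percent, limit=5.0):
    """Apply the rejection rule: keep results whose |CB| does not exceed the limit."""
    return abs(cb_percent) <= limit


# Hypothetical sample: Na+, Ca2+, Mg2+, K+ and HCO3-, Cl-, SO4 2-
cations = [(1, 1.00), (2, 0.50), (2, 0.25), (1, 0.10)]
anions = [(1, 1.50), (1, 0.60), (2, 0.20)]

cb = charge_balance(cations, anions)
print(f"CB = {cb:.2f}%  accepted: {is_acceptable(cb)}")
```

A positive CB indicates an excess of cations and a negative CB an excess of anions; the acceptance test uses the absolute value so that both cases are treated alike.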

Spearman’s correlation matrix

The coefficient of correlation (r) was used to understand the relationship between the various parameters and to test the significance of the models. The Spearman’s correlation matrix was generated using the Statistical Package for the Social Sciences (SPSS) 16.0 for Windows. The correlation matrix was studied to identify relationships between the observed parameters in order to explain factor loadings during PCA. In other words, the correlation matrix was used to point out internal structures and to assist in the identification of pollution sources not apparent at first glance (Satheeshkumar and Anisa Khan 2011). A correlation coefficient close to −1 or 1 indicates a strong relationship between two variables, and a value around zero (0) indicates no relationship between them, at a significance level of P < 0.05. Parameters with |r| > 0.7 are considered strongly correlated, those with |r| between 0.4 and 0.7 moderately correlated, and those with |r| < 0.4 show low to no correlation.
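The correlation step can be sketched in plain Python as follows. This is not the SPSS procedure used in the study; it is a minimal illustration that computes Spearman's coefficient as the Pearson correlation of average ranks and then applies the strength classes given above. The two data series are hypothetical, and the significance test (P value) is not reproduced here.

```python
# Minimal sketch: Spearman's rank correlation and the strength classes.
# The data series are hypothetical; P values are not computed.

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def strength(r):
    """Classify |r| as strong (>0.7), moderate (0.4-0.7) or low to none (<0.4)."""
    a = abs(r)
    if a > 0.7:
        return "strong"
    if a >= 0.4:
        return "moderate"
    return "low to none"


tds = [120.0, 340.0, 95.0, 510.0, 220.0]   # hypothetical TDS values (mg/L)
ca = [10.0, 32.0, 12.0, 41.0, 18.0]        # hypothetical Ca2+ values (mg/L)
r = spearman(tds, ca)
print(round(r, 2), strength(r))
```

Because the classification works on the absolute value, a coefficient of −0.8 is reported as strong in the same way as +0.8.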

Principal component analysis (PCA)

PCA is a powerful technique used to reduce the dimensionality of a data set consisting of a large number of interrelated variables while retaining as much as possible of the variability present in the data set (Zhang et al. 2009). The reduction is achieved by transforming the data set into a new set of variables, the principal components (PCs), which are orthogonal (non-correlated) and are arranged in decreasing order of importance (Zhang et al. 2009). The PCA technique extracts the eigenvalues and eigenvectors from the covariance matrix of the original variables. The principal components are linear combinations of the original variables, and the new axes lie along the directions of maximum variance (Shrestha and Kazama 2007). PCA thus explains the correlation amongst a large number of variables in terms of a smaller number of underlying factors without losing much information (Vega et al. 1998; Alberto et al. 2001). PCA can be expressed mathematically as presented in Eq. (2):

$$z_{ij} = pc_{i1} x_{1j} + pc_{i2} x_{2j} + \cdots + pc_{im} x_{mj}$$
(2)

where z is the component score, pc is the component loading, x is the measured value of the variable, i is the component number, j is the sample number, and m is the total number of variables.
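Eq. (2) can be illustrated with a short Python sketch that evaluates the component scores from a given set of loadings. The loadings and the (already standardized) measurements below are hypothetical, and the function name is illustrative; in practice the loadings would come from the eigenvectors extracted by the PCA software.

```python
# Minimal sketch of Eq. (2): z_ij = pc_i1 * x_1j + ... + pc_im * x_mj.
# Loadings and measurements are hypothetical values for two variables.

def component_scores(loadings, samples):
    """Return scores[i][j] for component i and sample j.

    loadings[i][k] is the loading pc_ik of variable k on component i;
    samples[j][k] is the measured (standardized) value x_kj of variable k
    in sample j.
    """
    return [
        [sum(pc * x for pc, x in zip(component, sample)) for sample in samples]
        for component in loadings
    ]


loadings = [[0.8, 0.6],     # component 1
            [-0.6, 0.8]]    # component 2
samples = [[1.0, 2.0],      # sample 1: x_11, x_21
           [-0.5, 0.5]]     # sample 2: x_12, x_22

scores = component_scores(loadings, samples)
for i, row in enumerate(scores, start=1):
    print(f"component {i}:", [round(z, 2) for z in row])
```

Each score is a weighted sum over all m variables, so a sample scores highly on a component only when the variables with large loadings on that component are themselves high in that sample.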

Statistical analysis

Statistical analyses were performed using SPSS 16.0 for Windows. PCA was used to reduce the dimensionality of the data set while retaining as much of its variability as possible. The Spearman’s correlation matrix was generated to identify relationships between the observed parameters in order to explain factor loadings during PCA. To approximate normality, all hydrochemical data (except pH) were log-transformed prior to statistical analyses. The hydrochemical data were also auto-scaled by calculating standard scores (z scores) and ensuring that all z scores were within ±2.5. For trace metals with concentrations below their detection limits, one-half of the respective detection limit was substituted and used in the statistical analysis. A probability value of P < 0.05 was considered statistically significant in this study.
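The preprocessing steps described above can be sketched as follows. This is an illustration only: the concentrations and the detection limit are hypothetical, a base-10 logarithm and the sample standard deviation are assumed (the text does not state which were used), and values below the detection limit are represented by None.

```python
# Minimal sketch of the preprocessing: half-detection-limit substitution,
# log transformation and z-score scaling. All input values are hypothetical.
import math
import statistics


def substitute_below_detection(values, detection_limit):
    """Replace None (below detection limit) with one-half of the limit."""
    return [detection_limit / 2.0 if v is None else v for v in values]


def log_transform(values):
    """Base-10 log transform (assumed base; not applied to pH)."""
    return [math.log10(v) for v in values]


def z_scores(values):
    """Standard scores using the sample standard deviation (an assumption)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


pb = [0.02, None, 0.05, None, 0.10]          # hypothetical Pb values (mg/L)
filled = substitute_below_detection(pb, detection_limit=0.01)
scaled = z_scores(log_transform(filled))

within_limits = all(abs(z) <= 2.5 for z in scaled)
print([round(z, 2) for z in scaled], within_limits)
```

The final check mirrors the ±2.5 criterion on the z scores; samples failing it would need to be examined as potential outliers.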

Results and discussion

The hydrochemical data for groundwater within the basin and their GPS coordinates are presented in Table 1, while the trace metal levels in groundwater and the GPS coordinates are presented in Table 2. The statistical summary of the hydrochemical data is presented in Table 3. The Spearman’s correlation matrix (Table 4) indicates that pH shows low-to-moderate correlation with all major and minor ions (except K and NO3–N). The matrix also shows that HCO3− had relatively high correlations with the major ions. According to Hounslow (1995), bicarbonate is essentially produced in silicate weathering reactions, suggesting that HCO3− in groundwater within the basin perhaps originates primarily from silicate weathering. Total dissolved solids (TDS) show moderate-to-strong correlation with Ca2+ (r = 0.78; p < 0.05), Mg2+ (r = 0.71; p < 0.05), Na+ (r = 0.72; p < 0.05), K+ (r = 0.62; p < 0.05), Cl− (r = 0.74; p < 0.05) and SO4 2− (r = 0.64; p < 0.05) (Table 4), suggesting that these major ions contribute positively to the total dissolved solids of the groundwater, can be accounted for by a major geochemical process, perhaps aluminosilicate weathering, and originate from the same source (Subba Rao 2002). Correlation analysis of the major ions revealed the expected process-based relationships between Mg2+ and Ca2+ (r = 0.84; p < 0.05), Ca2+ and Na+ (r = 0.79; p < 0.05), Ca2+ and K+ (r = 0.65; p < 0.05), Ca2+ and Cl− (r = 0.84; p < 0.05), Ca2+ and SO4 2− (r = 0.79; p < 0.05), Ca2+ and HCO3− (r = 0.73; p < 0.05), Mg2+ and Na+ (r = 0.87; p < 0.05), Mg2+ and K+ (r = 0.72; p < 0.05), Mg2+ and Cl− (r = 0.94; p < 0.05), Mg2+ and SO4 2− (r = 0.93; p < 0.05), Na+ and K+ (r = 0.79; p < 0.05), Na+ and Cl− (r = 0.93; p < 0.05), Na+ and SO4 2− (r = 0.91; p < 0.05), K+ and Cl− (r = 0.78; p < 0.05), K+ and SO4 2− (r = 0.81; p < 0.05) and Cl− and SO4 2− (r = 0.92; p < 0.05), derived mainly from geochemical processes such as ion exchange and silicate/aluminosilicate weathering within the aquifer. These process-based relationships between the observed parameters may be due to mineralogical influence, which is explained more explicitly by the factor loadings in the principal component analysis (PCA). The correlation between Cu2+ and Zn2+ (r = 0.92; p < 0.05) reveals the possible existence of a process-based (biochemical) relationship between the two metals. Zinc is one of the earliest known trace metals and a common environmental pollutant that is widely distributed in the aquatic environment, while copper is intimately related to the aerobic degradation of organic matter (Das and Nolting 1993). Aerobic degradation of organic matter in groundwater within the basin may, therefore, be responsible for the strong correlation between Cu2+ and Zn2+. The correlation matrix also shows the expected strong positive correlation between total hardness (TH) and Ca2+ (r = 0.86; p < 0.05) and between TH and Mg2+ (r = 0.71; p < 0.05), as calcium and magnesium ions are naturally responsible for hardness in water.

Table 1 Hydrochemical data of groundwater within the Lower Pra Basin and their GPS coordinates
Table 2 Trace metal levels in groundwater within the Lower Pra Basin and their GPS Coordinates
Table 3 Summary statistics of hydrochemical data for groundwater within the Lower Pra Basin
Table 4 Spearman’s correlation matrix for groundwater within the Lower Pra Basin

Data analysis using principal component analysis (PCA)

PCA using Varimax rotation with Kaiser normalization resulted in the extraction of three main principal components for the physico-chemical parameters and identified the factors influencing each component. The three principal components account for approximately 79% of the total variance in the hydrochemical data. Table 6 presents the initial principal components with their eigenvalues and the per cent of variance contributed by each component, while Table 8 presents the rotated component matrix of the main physico-chemical parameters. The component plot in rotated space is presented in Fig. 2. An eigenvalue gives a measure of the significance of a factor, and the factor with the highest eigenvalue is the most significant. Eigenvalues of 1.0 or greater are considered significant (Kim and Mueller 1978). Factor loadings are classified as ‘strong’, ‘moderate’ and ‘weak’, corresponding to absolute loading values of >0.75, 0.75–0.50 and 0.50–0.30, respectively (Liu et al. 2003).
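The two decision rules described above (retaining components with eigenvalues of 1.0 or greater, and classifying loadings by their absolute value) can be sketched in Python as follows. The eigenvalues are hypothetical and do not reproduce Table 6; the label used for loadings below 0.30 is an assumption, since the source classification stops at ‘weak’.

```python
# Minimal sketch of the eigenvalue criterion and the loading classes.
# Eigenvalues are hypothetical; they are not the values in Table 6.

def retained_components(eigenvalues, threshold=1.0):
    """Return 1-based indices of components with eigenvalue >= threshold."""
    return [i + 1 for i, ev in enumerate(eigenvalues) if ev >= threshold]


def explained_variance_percent(eigenvalues):
    """Per cent of total variance contributed by each component."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


def classify_loading(loading):
    """Classify by absolute value: >0.75 strong, 0.75-0.50 moderate, 0.50-0.30 weak."""
    a = abs(loading)
    if a > 0.75:
        return "strong"
    if a >= 0.50:
        return "moderate"
    if a >= 0.30:
        return "weak"
    return "not significant"  # assumed label for loadings below 0.30


eigenvalues = [4.0, 2.0, 1.0, 0.6, 0.4]       # hypothetical
kept = retained_components(eigenvalues)
percent = explained_variance_percent(eigenvalues)

print("retained:", kept)
print("variance (%):", [round(p, 1) for p in percent])
print(classify_loading(0.82), classify_loading(-0.61), classify_loading(0.35))
```

Summing the percentages of the retained components gives the cumulative variance explained, which is the figure quoted when a reduced set of components is reported.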

Fig. 2 Component plot in rotated space for groundwater within the Basin

Component 1 explains nearly 51.9% of the total variance (Table 6) and has strong positive loadings (>0.75) for EC, TDS, Mg2+, Ca2+, Na+, K+, Cl− and SO4 2− and a weak positive loading for HCO3− (Table 5), suggesting that the major ions contribute positively to the total dissolved solids of the groundwater and can be accounted for by major geochemical processes within the aquifer. By definition, TDS is a measure of the total dissolved solids, while EC reflects the total ions in solution. In general, a plot of TDS against EC shows a linear relationship with slope (m) and TDS–conductivity factor (r2). The general equation for this linear graph can be represented as KA = S, where K is the EC (µS/cm), S is the TDS (mg/L), and A is a constant whose value indicates whether a particular water type is high in HCO3−, SO4 2− or Cl− (Clark and Fritz 1997). Tay et al. (2014) reported that 72.4% of groundwater within the basin had A = 0.55, which suggests that groundwater within the basin is high in HCO3− and probably points to silicate weathering by carbon dioxide-charged water during water–rock interaction in the aquifers. This is also consistent with the TDS–EC correlation in the Spearman’s correlation matrix (Table 4), where TDS shows strong correlation with EC (r = 0.96; p < 0.05). Thus, the strong positive loadings of the major ions together with EC and TDS in Component 1 are expected and suggest their contribution to major geochemical processes through mineralogical influence.
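The relation KA = S can be illustrated with a short Python sketch that estimates the constant A from paired EC and TDS measurements. The least-squares fit through the origin is an assumed estimation method chosen for illustration (the source does not state how A was obtained), and the EC and TDS values are hypothetical.

```python
# Minimal sketch of KA = S: estimate A from paired EC (K) and TDS (S) values.
# A least-squares line through the origin is an assumed method; data are hypothetical.

def tds_ec_factor(ec, tds):
    """Least-squares estimate of A in S = A * K for a line through the origin."""
    return sum(k * s for k, s in zip(ec, tds)) / sum(k * k for k in ec)


def predict_tds(ec_value, factor):
    """Predict TDS (mg/L) from EC (uS/cm) using S = A * K."""
    return factor * ec_value


ec = [100.0, 200.0, 400.0]     # hypothetical EC values (uS/cm)
tds = [55.0, 110.0, 220.0]     # hypothetical TDS values (mg/L)

a = tds_ec_factor(ec, tds)
print(f"A = {a:.2f}; predicted TDS at 300 uS/cm = {predict_tds(300.0, a):.0f} mg/L")
```

With these illustrative inputs the fitted factor equals the value of 0.55 quoted from Tay et al. (2014); with real data the individual sample ratios S/K would scatter around the fitted value.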

Table 5 Component matrix of the main physico-chemical parameters

Component 2 explains approximately 17.5% of the total variance (Table 6) and has strong positive loadings for pH, SiO2 and HCO3− and weak negative loadings for PO4–P and NO3–N (Table 5), reflecting a common source, most likely silicate/aluminosilicate weathering by carbon dioxide-charged water. The strong positive loadings for SiO2, HCO3− and pH in the groundwater suggest sorption of silica by clay minerals (Siever and Woodward 1973). This is consistent with the finding of Tay et al. (2014) that dissolved silica in groundwater within the Lower Pra Basin originates from the chemical breakdown of silicates during weathering processes.

Table 6 Total variance explained

Component 3 explains approximately 9.5% of the total variance (Table 6) and has a moderate positive loading for NO3–N and a moderate negative loading for PO4–P (Table 5). Although Component 3 reflects a common source of anthropogenic origin (possibly pollution from human-induced activities, such as the use of inorganic fertilizer), it shows that NO3–N and PO4–P are inversely related, i.e., where the NO3–N concentration is high, the PO4–P concentration is low. The economic activity within the basin is primarily farming, where foodstuffs such as yam, plantain, banana, vegetables and fruits, and cash crops such as cocoa, oil palm and coffee, are grown. Land degradation resulting from poor farming practices, including the widespread indiscriminate use of nitrogen- and phosphorus-based fertilizers and, in some cases, agrochemicals, is among the human-induced activities most likely to have an anthropogenic impact on the water resources within the basin.

The results of the PCA for the physico-chemical and trace metal data, using Varimax rotation with Kaiser normalization, are presented in Tables 7 and 8. Five principal components accounting for 85.2% of the total variance were extracted on the basis of eigenvalues >1 (Table 8). The first three principal components explain 48.09, 14.62 and 8.47% of the total variance, respectively. The fourth and fifth principal components are considerably less important, explaining only 7.02 and 6.94% of the total variance, respectively. Thus, only the first three principal components as extracted in Table 7, which account for a large proportion (71.2%) of the total variance in the hydrochemical data, are considered. Table 7 presents the initial principal components with their eigenvalues and the per cent of variance contributed by each component, while Table 8 presents the rotated component matrix of the main physico-chemical and trace metal parameters. Component 1 explains nearly 48.09% of the total variance (Table 8) and has strong positive loadings (>0.75) for EC, TDS and the major ions (Mg2+, Ca2+, Na+, K+, Cl− and SO4 2−), a moderate positive loading for HCO3− and a weak positive loading for pH (Table 7), suggesting that the major ions contribute positively to the total dissolved solids of the groundwater and can be accounted for by major geochemical processes within the aquifer.

Table 7 Component matrix of hydrochemical data for groundwater within the Lower Pra Basin
Table 8 Rotated component matrix of the main physico-chemical and trace metal parameters

Component 2 explains approximately 14.67% of the total variance (Table 8) and has moderate negative loadings for HCO3− and pH, moderate positive loadings for NO3–N and Mn, and a strong negative loading for SiO2 (Table 7), reflecting both a natural source (silicate/aluminosilicate weathering by carbon dioxide-charged water) and an anthropogenic source (use of inorganic fertilizer in agricultural activities). Component 3 explains approximately 8.47% of the total variance (Table 8) and has moderate positive loadings for Pb and Fe, moderate negative loadings for Cu and Zn, and weak positive loadings for Hg and Se (Table 7). Although Component 3 reflects a common source of trace metal mobilization, it shows that Pb, Fe, Hg and Se are inversely related to Cu and Zn, i.e., where Pb, Fe, Hg and Se concentrations are high, Cu and Zn concentrations are low. This PCA result is consistent with the Spearman’s correlation matrix, in which the correlation between Cu2+ and Zn2+ (r = 0.92; p < 0.05) reveals the possible existence of a process-based relationship between the two metals.

The loadings and score plot of the first two PCs, which explain 62.75% of the variance, is presented in Fig. 3. Figure 3 shows the grouping of, and the relationships between, the variables. The major ions, EC and TDS are visible in the first and second quadrants and group together, indicating their close relationship. HCO3−, pH and SiO2 also group together, indicating their relationship and significance in silicate weathering within the basin, while the trace metals group together, reflecting a common source. This grouping pattern shows the strength of the mutual relation among the hydrochemical variables.

Fig. 3 Loadings and score plot for the first two PCs

Thus, from the PCA, it can be deduced that Component 1 delineates the main natural processes (water–soil–rock interactions) through which groundwater within the basin acquires its chemical characteristics, Component 2 delineates the incongruent dissolution of silicates/aluminosilicates, and Component 3 delineates the prevalence of pollution, principally from agricultural inputs, as well as trace metal mobilization in groundwater within the basin.

Hydrogeochemical processes influencing groundwater within the Lower Pra Basin

According to Tay et al. (2014), the major processes responsible for the chemical evolution of groundwater within the basin include silicate (SiO4)4− dissolution, ion-exchange reactions, sea aerosol spray, and the oxidation of pyrite (FeS2) and arsenopyrite (FeAsS). From Table 3, groundwater within the basin is strongly acidic to neutral, with 81% of boreholes recording pH outside the WHO (2004) Guideline Values for drinking water. The pH levels in groundwater within the basin are due principally to natural biogeochemical processes, and the silicates/aluminosilicates found within the basin are probably responsible for the acid-neutralizing potential of the groundwater (Tay et al. 2014). From Fig. 4a, b, the major cations Na+, Ca2+, Mg2+ and K+ contribute 41, 31, 16 and 12%, respectively, while the major anions HCO3−, Cl− and SO42− contribute 53, 28 and 19%, respectively. The hydrogeochemical transport model Phreeqc for Windows was used to assess the state of saturation of the groundwaters with respect to the major minerals (Table 9). Figure 5 presents the plot of calcite against dolomite saturation indices of groundwater within the basin. The results show that groundwaters within the basin are subsaturated with respect to both calcite and dolomite and therefore represent waters that have come from environments where calcite and dolomite are depleted or where Ca2+ and Mg2+ exist in other forms. Groundwaters within the basin have thus not reached equilibrium with the carbonates, owing to short residence times.

Tay et al. (2014), using groundwater geochemistry to determine the origin of the major dissolved ions, showed that the chemical composition of groundwater within the basin reflects the composition of the water that enters the groundwater reservoir combined with its reactions with the minerals of the granitic rocks (biotite, muscovite), the schist rocks (biotite, hornblende and actinolite), pyrite and arsenopyrite as the water travels along mineral surfaces in the pores or fractures of the unsaturated zone and the aquifer. The stability of plagioclase (anorthite) and its secondary weathering products gibbsite, kaolinite and Ca-montmorillonite, and of albite and its secondary weathering products gibbsite, kaolinite and Na-montmorillonite, with respect to groundwater within the Lower Pra Basin showed that, consistent with natural waters with low silica concentrations, most of the groundwaters plot in the kaolinite stability field in both cases, indicating that kaolinite is the most stable secondary silicate mineral phase for the groundwater system. Thus, silicate/aluminosilicate weathering processes may have contributed significantly to the Ca2+, Mg2+ and Na+ concentrations in groundwater within the basin (Tay et al. 2014). Stable isotope (2H and 18O) results showed that the waters are principally of meteoric origin, with evaporation playing an insignificant role in the infiltrating water (Tay et al. 2014).

Tay et al. (2015) assessed the most relevant controls on groundwater quality within the basin using Q-mode hierarchical cluster analysis (HCA). The Q-mode HCA classified the hydrochemical data into four water groups and five subgroups. The results delineated two main water types, Na–HCO3 and Ca–Mg–HCO3, with Na–Cl and Ca–Mg–Cl as minor water types. The results further showed that Groups 1 and 2 waters both represent transition zones between Ca–Mg–HCO3/Na–HCO3 and Na–Cl/Ca–Mg–Cl type waters and can therefore be regarded as transition zones between naturally circulating groundwaters that have not undergone pronounced water–rock interaction/aggressive recharging alkali carbonate waters and limited recharging local rain/permanent hard water. Tay et al. (2015) also showed that surface waters within the basin are principally Na–HCO3 type waters and are therefore reminiscent of aggressive recharging waters that may potentially be serving as recharge reservoirs to groundwater within the basin. Tay et al. (2015) concluded that groundwater within the basin perhaps evolves from fresh Ca–Mg–HCO3/Na–HCO3 type waters to permanent hard Ca–Mg–Cl type waters and limited recharging local rain Na–Cl type waters along the groundwater flow paths, principally due to ion-exchange reactions. However, the Zion Camp area may be serving as a discharge area for groundwater within the basin.

PCA using Varimax rotation with Kaiser normalization resulted in the extraction of three main principal components, which identify the factors influencing the main physico-chemical parameters. The three principal components account for approximately 79% of the total variance in the hydrochemical data. Component 1 delineates the main natural processes (water–soil–rock interactions) through which groundwater within the basin acquires its chemical characteristics, Component 2 delineates the incongruent dissolution of silicates/aluminosilicates, while Component 3 delineates the prevalence of pollution, principally from agricultural input, as well as trace metal mobilization in groundwater within the basin.
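The PCA workflow summarized above (extraction of components from the correlation matrix, followed by Varimax rotation with Kaiser normalization) can be illustrated with a minimal sketch. The data below are synthetic and purely hypothetical, the function name `varimax_kaiser` is introduced here for illustration, and the NumPy implementation is an assumption about one common way to carry out these steps; it is not the software or the data used in this study.

```python
import numpy as np

def varimax_kaiser(loadings, max_iter=200, tol=1e-8):
    """Varimax rotation with Kaiser (row) normalization of a p x k loading matrix."""
    p, k = loadings.shape
    h = np.sqrt((loadings ** 2).sum(axis=1))   # square roots of the communalities
    L = loadings / h[:, None]                  # Kaiser normalization: unit-length rows
    R = np.eye(k)                              # rotation matrix, starts as identity
    crit = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        grad = L.T @ (Lr ** 3 - Lr * ((Lr ** 2).sum(axis=0) / p))
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt                             # nearest orthogonal matrix to the gradient
        if s.sum() < crit * (1 + tol):         # stop when the criterion no longer improves
            break
        crit = s.sum()
    return (L @ R) * h[:, None]                # undo the normalization

# Synthetic, hypothetical data: 80 samples, 8 variables driven by 3 latent factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(80, 3))
mixing = rng.normal(size=(3, 8))
X = factors @ mixing + 0.5 * rng.normal(size=(80, 8))

# PCA on the correlation matrix (equivalent to PCA on standardized variables).
corr = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]              # sort components by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

n_comp = 3
explained = eigvals[:n_comp].sum() / eigvals.sum()         # fraction of total variance
loadings = eigvecs[:, :n_comp] * np.sqrt(eigvals[:n_comp]) # unrotated component loadings
rotated = varimax_kaiser(loadings)                         # rotated loadings

print(f"Variance explained by {n_comp} components: {100 * explained:.1f}%")
print(np.round(rotated, 2))
```

In such a sketch, the rotated loading matrix is what would be inspected to attach a process to each component: variables with high absolute loadings on the same component are interpreted together. The rotation redistributes variance among the retained components but leaves each variable's communality unchanged.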

Fig. 4 a, b Relative proportions of the major dissolved constituents in groundwater within the Basin
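As an illustration of how relative proportions of major ions such as those in Fig. 4 can be derived, the sketch below converts concentrations in mg/L to milliequivalents per litre and expresses each ion as a percentage of the group total. The use of meq/L as the basis is an assumption (the basis is not stated in this section), the sample concentrations are hypothetical, and the function name `relative_proportions` is introduced here for illustration only.

```python
# Equivalent weights (g/eq) = molar mass / ionic charge.
EQUIV_WEIGHT = {
    "Na": 22.99, "K": 39.10, "Ca": 40.08 / 2, "Mg": 24.305 / 2,
    "HCO3": 61.02, "Cl": 35.45, "SO4": 96.06 / 2,
}

def relative_proportions(conc_mg_per_l):
    """Return each ion's share (%) of the summed meq/L within the given group."""
    meq = {ion: c / EQUIV_WEIGHT[ion] for ion, c in conc_mg_per_l.items()}
    total = sum(meq.values())
    return {ion: 100.0 * v / total for ion, v in meq.items()}

# Hypothetical sample concentrations in mg/L, NOT data from this study.
cations = relative_proportions({"Na": 23.0, "Ca": 20.0, "Mg": 6.0, "K": 4.0})
anions = relative_proportions({"HCO3": 61.0, "Cl": 20.0, "SO4": 15.0})

for ion, pct in {**cations, **anions}.items():
    print(f"{ion}: {pct:.1f}%")
```

Cations and anions are normalized separately, so each group sums to 100%, which matches the way the cation and anion contributions are reported as two separate sets of percentages.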

Table 9 Saturation indices for groundwater calculated using Phreeqc for Windows

Fig. 5 A plot of calcite against dolomite saturation indices of groundwater within the Basin
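The saturation index underlying Table 9 and Fig. 5 is defined as SI = log10(IAP/K), where IAP is the ion activity product and K is the mineral's equilibrium constant; SI < 0 indicates subsaturation, SI = 0 equilibrium and SI > 0 supersaturation. The sketch below is a simplified illustration only: it takes ion activities as given, whereas Phreeqc computes them from a full speciation calculation. The activity values are hypothetical, the log K values are the commonly tabulated values at 25 °C, and the function names are introduced here for illustration.

```python
import math

# Commonly tabulated equilibrium constants at 25 degrees C (assumed here).
LOG_K_CALCITE = -8.48     # CaCO3 = Ca2+ + CO3 2-
LOG_K_DOLOMITE = -17.09   # CaMg(CO3)2 = Ca2+ + Mg2+ + 2 CO3 2-

def saturation_index(log_iap, log_k):
    """SI = log10(IAP) - log10(K); negative values mean subsaturation."""
    return log_iap - log_k

def calcite_si(a_ca, a_co3):
    """Calcite SI from the activities of Ca2+ and CO3 2-."""
    return saturation_index(math.log10(a_ca * a_co3), LOG_K_CALCITE)

def dolomite_si(a_ca, a_mg, a_co3):
    """Dolomite SI from the activities of Ca2+, Mg2+ and CO3 2-."""
    return saturation_index(math.log10(a_ca * a_mg * a_co3 ** 2), LOG_K_DOLOMITE)

# Hypothetical ion activities for a dilute, acidic groundwater (NOT study data).
si_cal = calcite_si(a_ca=3e-4, a_co3=1e-7)
si_dol = dolomite_si(a_ca=3e-4, a_mg=2e-4, a_co3=1e-7)

print(f"SI calcite  = {si_cal:.2f}")
print(f"SI dolomite = {si_dol:.2f}")
```

A sample with both indices below zero would plot in the lower-left quadrant of a calcite against dolomite diagram such as Fig. 5, which is the field of waters subsaturated with respect to both carbonates.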

Conclusion and recommendations

The application of multivariate statistical techniques to groundwater assessment within the Lower Pra Basin has shown that the correlation matrix of major ions reveals expected process-based relationships derived mainly from geochemical processes, such as ion exchange and silicate/aluminosilicate weathering within the aquifer. Spearman’s correlation matrix and PCA results show the possible existence of a process-based relationship between Cu2+ and Zn2+ (r = 0.92; p < 0.05). Three main principal components influence the water chemistry and pollution of groundwater within the basin, and together they account for approximately 79% of the total variance in the hydrochemical data. Component 1 delineates the main natural processes (water–soil–rock interactions) through which groundwater within the basin acquires its chemical characteristics, Component 2 delineates the incongruent dissolution of silicates/aluminosilicates, while Component 3 delineates the prevalence of pollution, principally from agricultural input, as well as trace metal mobilization in groundwater within the basin. In terms of trace metal mobilization, the study shows that, although the trace metals reflect a common source of mobilization, Cu and Zn concentrations are low where Pb, Fe, Hg and Se concentrations are high. The loadings and score plots of the first two PCs show a grouping pattern which indicates the strength of the mutual relation among the hydrochemical variables. In terms of proper management and development of groundwater within the basin, groundwater in communities where intense agriculture is taking place should be monitored and protected from agricultural activities by creating buffer zones, especially where inorganic fertilizers are used. Monitoring of water quality, especially pH, is recommended to ensure the continued acid-neutralizing potential of groundwater within the basin, thereby curtailing further trace metal mobilization.