Index-overlay method (PI)
The intrinsic vulnerability of the aquifer was estimated with the index-overlay PI method (Goldscheider et al. 2000). PI was selected because its key factors were considered better suited to the specific conditions of the area (e.g., the significant importance of the protective cover and the occurrence of a “sinking stream”). Input data were derived from various sources, including ground-truth values from 71 boreholes (Fig. 1) in the karstic aquifer, as well as field work and literature (Tziritis 2008). The parameters used were the following: vertical stratigraphy of the boreholes, piezometric level, depth of the aquifer from the surface, degree of fissuring and karstification of the substrate, topographic slope, lithology of the bedrock, type and origin of the Quaternary deposits, occurrence of katavothres and sinkholes, occurrence and route of the blind river (a river sinking into a karstic formation), which in this case is the River Melas, and finally the spatial delineation of the hydrological sub-basins.
The PI method is a GIS-based approach with special consideration of karstic aquifers. It is based on an origin–pathway–target model (Fig. 2), where the origin is taken to be the ground surface and the target is the uppermost aquifer. The pathway includes all the layers lying between the origin and the target. The acronym stands for two factors, protective cover (P) and infiltration conditions (I). The simplified flowchart of the PI method is shown in Fig. 3.
Estimation of P-factor (protective cover)
The P-factor describes the protective function of the layers (soil, subsoil, non-karstic rock, and karstic rock) between the ground surface and the water table. It is calculated according to a slightly modified version of the German (GLA) method (Holting et al. 1995) and is divided into five classes, from 1 (very low protection) to 5 (very high protection). It expresses the effect of the protective cover on the basis of the soil’s effective field capacity (eFC), the grain size distribution of the subsoil, the lithology, fissuring and karstification of the non-karstic and karstic rock, the thickness of all strata, and the mean annual recharge.
The above parameters were related to the “total protective function” (Goldscheider et al. 2000) to define the degree of total protection (P_TS):
$$P_{\text{TS}} = \left[ T + \left( \sum\limits_{i = 1}^{m} S_{i} \times M_{i} + \sum\limits_{j = 1}^{n} B_{j} \times M_{j} \right) \right] \times R + A$$
(1)
where T is the T-factor that refers to topsoil, S is the S-factor that refers to subsoil, B is the B-factor that refers to bedrock, M is the thickness (m) in the unsaturated zone, R is the R-factor that refers to recharge in terms of precipitation height, and A is the A-factor that refers to artesian conditions, if present.
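As an illustration only, Eq. 1 can be evaluated for a single location with a few lines of Python. The function name, the argument layout (one score and thickness pair per layer) and all numerical scores below are hypothetical and are not taken from the study or from the PI scoring tables.

```python
from typing import Sequence, Tuple


def total_protective_function(
    t: float,
    subsoil: Sequence[Tuple[float, float]],
    bedrock: Sequence[Tuple[float, float]],
    r: float,
    a: float = 0.0,
) -> float:
    """Evaluate Eq. 1 for one location.

    `subsoil` and `bedrock` hold one (score, thickness in m) pair per
    layer of the unsaturated zone, i.e. (S_i, M_i) and (B_j, M_j).
    """
    s_term = sum(s * m for s, m in subsoil)   # sum of S_i x M_i
    b_term = sum(b * m for b, m in bedrock)   # sum of B_j x M_j
    return (t + s_term + b_term) * r + a


# Hypothetical scores, chosen only to show the arithmetic.
p_ts = total_protective_function(
    t=100.0,
    subsoil=[(50.0, 2.0), (10.0, 4.0)],
    bedrock=[(5.0, 20.0)],
    r=1.5,
    a=0.0,
)
print(p_ts)  # (100 + 140 + 100) x 1.5 + 0 = 510.0
```

In a GIS workflow the same expression would be applied cell by cell to the rasters of the individual sub-factors.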
The joint consideration of the above sub-factors resulted in the calculation of the final P-factor, whose spatial distribution is shown in Fig. 4. The range of P_TS values corresponds to a P-factor value (for more details see Goldscheider et al. 2000 and Goldscheider 2005) which ranges from P = 1 for an extremely low degree of protection to P = 5 for very thick and protective overlying layers.
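The conversion from a total-protection score to a P class can be sketched as a simple binning step. The class limits used below (10, 100, 1000 and 10 000) are an assumption made for illustration and are not stated in this text; they should be checked against the cited PI publications before any use.

```python
import numpy as np

# Assumed upper limits of P classes 1 to 4 (illustrative only; verify
# against Goldscheider et al. 2000 and Goldscheider 2005).
P_TS_BREAKS = np.array([10.0, 100.0, 1000.0, 10000.0])


def p_factor(p_ts):
    """Map total-protection scores to P classes 1..5."""
    p_ts = np.asarray(p_ts, dtype=float)
    # right=True: a score equal to a limit stays in the lower class.
    return np.digitize(p_ts, P_TS_BREAKS, right=True) + 1


print(p_factor([5.0, 10.0, 510.0, 20000.0]))  # [1 1 3 5]
```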
Estimation of the I-factor (infiltration)
The I-factor describes the infiltration conditions, and more specifically the degree to which the protective cover is bypassed as a result of lateral surface and subsurface flow in the catchment of karstic holes and sinking (blind) streams. If the protective cover is completely bypassed by a swallow hole through which surface water may pass directly into the karstic aquifer, then I = 0, whereas I = 1 if infiltration occurs diffusely (e.g., on a flat, highly permeable and free-draining surface). Intermediate values occur in catchment areas of variable slope, depending on the proportion of lateral flow components. The final protection factor “π” is obtained as the product of P and I (π = P × I) and is subdivided into five classes, where π ≤ 1 indicates a very low degree of protection and hence extreme vulnerability, while π = 5 indicates very high protection and hence very low vulnerability (Vrba and Zaporozec 1994; Goldscheider et al. 2000). Mapping the spatial distribution of the “π” factor in a GIS yields the final outcome, namely the assessment of the karstic aquifer’s intrinsic vulnerability.
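A minimal sketch of the π = P × I overlay on raster arrays is given below. Only the two end classes (π ≤ 1 and π = 5) are stated in the text; the intermediate class limits at 2, 3 and 4 and the function names are assumptions made for illustration.

```python
import numpy as np


def protection_factor(p, i):
    """Cell-by-cell product pi = P x I of two equally shaped rasters."""
    return np.asarray(p, dtype=float) * np.asarray(i, dtype=float)


def protection_class(pi):
    """Class 1 (pi <= 1, extreme vulnerability) to class 5 (pi in (4, 5]).

    The limits at 2, 3 and 4 are assumed; rounding guards against
    floating-point noise in products such as 3 x 0.4.
    """
    pi = np.round(np.asarray(pi, dtype=float), 6)
    return np.clip(np.ceil(pi), 1, 5).astype(int)


# Hypothetical 2 x 2 rasters of P classes and I values.
p_map = np.array([[1, 3], [5, 5]])
i_map = np.array([[0.0, 0.4], [0.8, 1.0]])
pi_map = protection_factor(p_map, i_map)
print(protection_class(pi_map))
```

A cell with high protective cover (P = 5) still falls to the lowest class when it is fully bypassed (I = 0), which is the behaviour the multiplicative form is meant to capture.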
To determine the I-factor, three consecutive steps were carried out. First, the dominant flow process was estimated as a function of (1) the saturated hydraulic conductivity (m/s) of the topsoil (or the uppermost layer), and (2) the depth to low-permeability layers inside or below the topsoil (or the uppermost layer). The combination of these two produced a map (GIS coverage) of the dominant flow process.
The second step was the estimation of the I′-factor, which depends on (1) the result of the previous step (dominant flow process), (2) the topographic slope, and (3) the vegetation. The values (raster files) of these parameters were added for each grid cell and yielded the I′-factor, which ranged from 0.0 to 1.0. The result (GIS coverage) is called the I′-map and shows the occurrence and intensity of lateral surface and subsurface flow.
Finally, the last step was the compilation of the catchment map of the blind river (sinking stream). The conceptual approach behind this map is based on the assumption that lateral surface and subsurface flow may pose a risk to groundwater only if the contaminated water enters the karstic aquifer in a concentrated way, e.g., via a sinking (blind) stream. In the study area this is the River Melas (Fig. 5), which sinks into a karstic hole in the northeastern region (the great katavothre of Aghios Ioannis). According to the criteria of the original PI method (Goldscheider et al. 2000), the River Melas catchment was delineated into four buffer zones depending on the distance from the riverbed and the surface occurrence of sinking holes and katavothres. The result was a map of the River Melas catchment according to the PI method criteria, constructed with the aid of a GIS.
The final calculation of the I-factor combined the previously assessed data to compile the I-map (Fig. 5), which shows the degree to which the protective cover is bypassed. Each grid cell is assigned a value derived from the intersection of the I′-map (showing the occurrence and intensity of lateral flow) with the River Melas catchment map (showing the sinking streams and their catchments).
Data processing and estimation of intrinsic aquifer vulnerability
The parameters/sub-factors considered to be spatially continuous (e.g., those derived from maps, such as soil, land use, and surface geology) were input to the GIS without any pre-processing as individual raster files, with a unique value for each grid cell. In contrast, the water table and the thickness of the subsoil were based on spatial interpolation (IDW algorithm) of the data obtained from the 71 boreholes. In addition, the thickness of the subsoil and the spatial distribution of the subsurface stratigraphy (including the degree of fissuring and karstification) were assessed with the aid of the derived geological map and further refined with ground-truth data from the 71 boreholes. Hence, each cell was assigned a unique value derived from the joint overlay of all the considered parameters/sub-factors.
The final estimation of intrinsic aquifer vulnerability resulted from the combination of the individual P and I factors (P × I) over the entire study area, which was divided into 25 × 25 m grid cells. The final outcome is the map in Fig. 6, which shows the spatial distribution of the intrinsic vulnerability of the karstic aquifer.
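The interpolation step can be illustrated with a plain NumPy version of inverse distance weighting. The text does not report the IDW settings, so the power of 2, the use of all points for every cell, and the borehole coordinates and depths below are assumptions for the sake of the example.

```python
import numpy as np


def idw(xy_known, values, xy_query, power=2.0):
    """Inverse-distance-weighted estimate at each query point."""
    xy_known = np.asarray(xy_known, dtype=float)
    values = np.asarray(values, dtype=float)
    xy_query = np.asarray(xy_query, dtype=float)
    # Pairwise distances, shape (n_query, n_known).
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    exact = d == 0.0
    with np.errstate(divide="ignore"):
        w = 1.0 / d**power
    # A query point that coincides with a borehole takes its value.
    w[exact.any(axis=1)] = 0.0
    w[exact] = 1.0
    return (w * values).sum(axis=1) / w.sum(axis=1)


# Hypothetical borehole coordinates (m) and water-table depths (m).
boreholes = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]])
depths = np.array([10.0, 20.0, 30.0])

# Centroids of a 25 m grid covering a 100 m x 100 m window.
xs = np.arange(12.5, 100.0, 25.0)
gx, gy = np.meshgrid(xs, xs)
grid = np.column_stack([gx.ravel(), gy.ravel()])
surface = idw(boreholes, depths, grid).reshape(gx.shape)
```

Because IDW is a weighted average, the interpolated surface never leaves the range of the measured values, which is worth keeping in mind where boreholes are sparse.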
Statistical method
Despite the limited adoption of statistical methods in aquifer vulnerability studies, other branches of the earth sciences have used these approaches extensively in a wide range of applications (e.g., Conoscenti et al. 2013; Lombardo et al. 2014; Sahragard and Chahouki 2015). The procedure involves the identification of a dependent variable that a given algorithm, among the several available in the literature, explains multivariately through a set of covariates. A calibration phase takes place first, in which the selected algorithm learns to discriminate (through classification or regression) the presence [for presence-only methods (Renner et al. 2015)] or the presence and absence [for presence–absence methods (Corani and Mignatti 2015)] of the dependent variable. A functional relation between the predicted variable and the predictors is subsequently derived and used to calculate the probability of occurrence over broad areas such as catchments or regions. This prediction is then validated by testing its correctness against an unknown dataset. In this contribution, we adopted a presence-only approach known as maximum entropy (Phillips et al. 2006).
The dependent dataset was generated by selecting, among the 41 groundwater samples, those with a nitrate concentration above the screening value of 37 mg/L imposed by the Water Framework Directive (EU 2000). To increase the number of samples, a 73-m radius was drawn around each borehole location, and all the centroids of a synthetic 25-m grid covering the study area that fell within these buffers were defined as vulnerable. This operation increased the number of vulnerable samples to 258 points (virtual samples), spread across the eastern Kopaida plain but clustered in the proximity of the available boreholes. The predictor space coincides with the same 25-m grid, which was originally set by the available DEM. Primary and secondary topographic attributes were calculated from the DEM: (1) elevation; (2) slope (Horn 1981); (3) profile curvature; (4) plan curvature; (5) catchment area; (6) topographic wetness index (Beven and Kirkby 1979); (7) landform classification (Tagil and Jenness 2008). These predictors were selected to represent superficial and shallow topographic influences on aquifer vulnerability. Geology and land use were also added as predictors and gridded in the same way. The predictors thus include both continuous and categorical types, the latter being coded as shown in Table 1. The calibration phase used a random 75 % of the 258 vulnerable points, while the validation was performed on the remaining 25 %. The process was repeated ten times to evaluate the stability of the prediction across the replicates.
The performance evaluation focused on three main steps. The first assessed the predictive skill of the models through the area under the curve (hereafter, AUC; Phillips et al. 2006). This parameter represents the integral of a receiver operating characteristic curve (hereafter, ROC; Parolo et al. 2008), which relates the proportion of positive cases that are correctly identified to the false-positive rate; in Maxent applications, the latter is evaluated on points randomly extracted in the geographic space. Araújo and Guisan (2006) suggested AUC thresholds indicating average prediction between 0.7 and 0.8, good between 0.8 and 0.9 and excellent between 0.9 and 1. The second metric was the percent contribution of each predictor to the full model. This is commonly referred to as predictor importance (Lombardo et al. 2015), and it establishes how much a given predictor affects the final probability value at a given cell. Complementarily, the role of each predictor was evaluated through response curves (Lombardo et al. 2015). These curves show the relation between the final probability and the domain of each predictor. By setting a probability threshold at 0.5, it is possible to distinguish negative and positive relations between the dependent variable and each predictor.
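The sample-generation and evaluation steps described above can be sketched as follows. This is not the Maxent software itself: the three helper functions, the random seed and the example coordinates are hypothetical, and the AUC is computed with the generic rank-based definition (the probability that a presence point scores higher than a randomly drawn background point).

```python
import numpy as np


def centroids_within(cx, cy, boreholes, radius=73.0):
    """Mask of grid centroids lying within `radius` (m) of any borehole."""
    pts = np.column_stack([cx.ravel(), cy.ravel()])
    bh = np.asarray(boreholes, dtype=float)
    d = np.linalg.norm(pts[:, None, :] - bh[None, :, :], axis=2)
    return (d <= radius).any(axis=1).reshape(cx.shape)


def repeated_splits(n, train_frac=0.75, replicates=10, seed=0):
    """Yield (calibration, validation) index arrays, one pair per replicate."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_frac * n))
    for _ in range(replicates):
        perm = rng.permutation(n)
        yield perm[:n_train], perm[n_train:]


def auc(presence_scores, background_scores):
    """Rank-based AUC; ties between the two groups count as one half."""
    pos = np.asarray(presence_scores, dtype=float)[:, None]
    bg = np.asarray(background_scores, dtype=float)[None, :]
    return float(((pos > bg) + 0.5 * (pos == bg)).mean())


# Centroids of a 25 m grid over a hypothetical 225 m x 225 m window,
# with a single hypothetical borehole on the central centroid.
xs = np.arange(12.5, 225.0, 25.0)
cx, cy = np.meshgrid(xs, xs)
mask = centroids_within(cx, cy, [[112.5, 112.5]])
splits = list(repeated_splits(258))
```

With a 73 m radius on a 25 m grid, a borehole that sits on a centroid captures the surrounding 5 × 5 block of cells, which shows how a limited number of boreholes can grow into a few hundred virtual samples.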
Table 1 Correspondence between the categorical codes and their actual classes
Validation of results
The validation of an intrinsic vulnerability map is always a challenging and complex task that requires caution to avoid erroneous interpretations and misleading results. The most common validation process is to measure the concentration of a surface-released potential contaminant and to compare the spatial distribution of its values with the vulnerability results. However, this process is not always straightforward. For example, a common mistake is to ignore the interactions between the contaminant and the geological environment, which may alter its concentrations and have a significant impact on its overall fate and transport. This highlights the importance of specific vulnerability, which is by nature linked with those processes and is regarded as the most integrated approach to vulnerability assessment. However, even in intrinsic vulnerability assessment, those interactions should be taken into account indirectly during the validation process in order to obtain representative results with minimized errors.
Another common error in vulnerability validation is the neglect of lateral contaminant fluxes. By definition, aquifer vulnerability accounts for the vertical susceptibility of the system and does not consider any lateral crossflows of contaminant plumes from adjacent hydrogeological units. Hence, in practice, validation with contaminant values measured in the saturated zone should only be performed in hydrologically “closed” systems, without any hydraulic connections to other units (surface or underground); otherwise, potential migration of contaminant plume(s) has to be taken into account prior to validation.
In the case of the eastern Kopaida plain, validation was performed using the average nitrate values from 41 boreholes (Tziritis 2008) for the wet (November–April) and dry (May–September) periods of the hydrological year 2004–2005. Essential precautions regarding potential interactions and migrating plumes were taken during the validation process. Such interactions included abnormal deviations from the typical range of concentrations, for example, due to the development of strongly anoxic zones within the heterogeneous karstic aquifer, which dramatically decreased nitrate concentrations through reduction. In addition, nitrate plumes migrating from adjacent basins often impart elevated values that cannot be attributed to the land use activities and physical properties of the study area.
The abnormal nitrate values due to redox conditions were excluded from the validation process as non-representative. Their screening was based on previous hydrogeochemical assessments (Tziritis 2009, 2010) and was further confirmed statistically, as these values were outliers falling below m − 2s (where “m” is the mean value and “s” is the standard deviation). Likewise, the borehole samples that were shown to be affected by lateral crossflows and contaminant migration were also excluded from the validation process. This screening was based on hydrogeological, hydrogeochemical and isotopic criteria from previous research (Tziritis 2009, 2010).
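The statistical part of this screening can be illustrated with a short sketch that flags values below m − 2s. The nitrate values are invented for the example, and the use of the sample standard deviation (ddof = 1) is an assumption, since the text does not specify which estimator was used.

```python
import numpy as np


def low_outlier_mask(values, k=2.0):
    """Flag values falling below m - k*s (mean minus k standard deviations)."""
    v = np.asarray(values, dtype=float)
    m = v.mean()
    s = v.std(ddof=1)  # sample standard deviation (assumed)
    return v < m - k * s


# Hypothetical nitrate concentrations (mg/L); the last one mimics a
# sample from a strongly reducing zone.
nitrate = [40, 42, 38, 41, 39, 43, 40, 41, 39, 1]
flagged = low_outlier_mask(nitrate)
print(flagged)
```

Such a test only supports the screening; in the study the exclusion rested primarily on the hydrogeochemical evidence cited above.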