1 Introduction

In many European countries, groundwater is the main source of water for drinking, irrigation and industry. Its contribution is even higher where surface flow is limited or intermittent. According to Eurostat data (Fig. 1) (Eurostat 2016), groundwater is one of the main sources of water for agricultural, industrial and domestic use. Its share of total abstraction ranges from less than 10% in countries with abundant surface water, e.g., Finland, Lithuania, the Netherlands and Belgium, to more than 90% in countries with scarce surface water, e.g., Malta, Denmark and Iceland. The reasons for surface water scarcity vary and include climate and topography. For example, in Malta groundwater is the only source available to overcome the water scarcity caused by its semiarid climate; Iceland uses mainly groundwater because it also harnesses its geothermal energy; and in other countries with high groundwater use, e.g., Croatia and Cyprus, surface water use is restricted by the karstic subsurface and the lack of large rivers. The average groundwater contribution to supply in the 30 countries for which data are available is more than 35%.

Fig. 1

Average percentage of water abstraction by source in European countries (Source: Eurostat)

Because groundwater makes up a non-negligible share of the freshwater supply, it is important, when implementing hydrological models, to keep the groundwater part of the model on par with its surface water counterpart. This is particularly important if the model is to be applied to many different regions. In this work, we explain the changes made to, and the first applications of, an already established and well-documented integrated surface water/groundwater model, namely LISFLOOD (Alfieri et al. 2013; Jongman et al. 2014). Work on water resources management in particular (de Roo et al. 2016; Udias et al. 2016) shows that groundwater is still not adequately represented in large-scale modelling, in contrast to its importance for water management. At this stage, the main changes comprise improving the representation of groundwater processes in LISFLOOD and adding missing elements: a) the ability of groundwater to move in two dimensions between cells; and b) a bi-directional connection between the groundwater and the river. To that end, we used a conceptual MODFLOW model to simulate groundwater flow among the cells (yellow arrows, Fig. 2) and between the groundwater and the river (blue arrows, Fig. 2). This way, we can trace the paths of groundwater in two dimensions and examine the exchanges between river and aquifer.

Fig. 2

Coupling of river and groundwater. The blue arrows show flow between aquifer and river, while the yellow arrows show flow between two adjacent groundwater cells

The interaction between surface water and groundwater models has been the subject of many publications. Approaches include, among others, full-blown 3D models (Spanoudaki et al. 2009) and soil-water balance models combined with groundwater fluctuation (Jie et al. 2011). Our approach lies between these two, with the soil-water balance included in LISFLOOD and a two-dimensional groundwater flow model (MODFLOW). For more information on surface water and groundwater interaction models, see the thorough review by Barthel and Banzhaf (2016).

Running an integrated surface water and groundwater model at a large scale is a challenging task because of the complexity of the processes included in the hydrological cycle, the considerable amount of data required and, above all, the uncertainty of large-scale groundwater data, both for parameterizing the model and for verifying the results. Previous work on large-scale simulation has focused mainly either on surface water resources (Werner et al. 2005; de Roo et al. 2014) or on groundwater (Sutanudjaja et al. 2011; Fan et al. 2013; de Graaf et al. 2015), with few applications of coupled models of surface water and groundwater (Sutanudjaja et al. 2013). At smaller scales, attempts at coupled surface water and groundwater modelling have been more frequent. In the past, there were claims that the integrated modelling concept might prove too complicated for researchers to apply (Jeffrey and Gearey 2006). Fortunately, computational performance has improved, so computationally intensive models now run significantly faster than they did a decade ago.

The current work builds on LISFLOOD, a model already known and tested in water resources management, and improves its ability to simulate groundwater processes. The main innovation is that the MODFLOW coupling allows water from the lower zone to move in two dimensions among adjacent cells. The coupled model is expected to simulate the processes of the hydrologic cycle better and to provide results that support the understanding and management of water resources. Another novelty is the two-way connection of the lower zone with the river. Because of this, the model can now produce a map of recharge and discharge zones of the aquifers.

2 Methodology

The main goal of the present work is the coupling of a finite-difference groundwater model (MODFLOW) with an already established hydrological model (LISFLOOD), to improve the latter’s ability to simulate the groundwater part of the hydrological cycle. The desired outcome was the identification of groundwater and surface-water interaction zones (recharge areas of aquifers from rivers and vice versa). This is a necessary step for the further improvement of the groundwater module that exists in LISFLOOD, which does not allow for interactions (i.e., transfer of water volumes) among adjacent cells. Because of the model’s large-scale application, some conceptual thinking is required to overcome limitations inherent in the two original models. The main limitation of the MODFLOW model, or of any similar groundwater algorithm, is the need for an explicit and detailed definition of the geological formations, together with hydraulic conductivity and porosity information. Instead of trying to describe the geological formations explicitly, in this first iteration we relied on large-scale data with qualitative information about the existence and productivity of aquifers.

2.1 Model Setup

The model setup was based on previous iterations of the LISFLOOD hydrological model. LISFLOOD is a spatially distributed (grid-based) hydrological rainfall-runoff and routing model, including a hydrodynamic channel routing routine, developed by the JRC (de Roo et al. 2000; van der Knijff et al. 2010). It calculates a complete water balance in daily time steps and for every grid-cell.

The LISFLOOD model is made up of the following components:

  • a two-layer soil water balance sub-model;

  • sub-models for the simulation of groundwater and subsurface flow (using two parallel interconnected linear reservoirs);

  • a sub-model for the routing of surface runoff to the nearest river channel;

  • a sub-model for the routing of channel flow.
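The two linear reservoirs of the groundwater and subsurface flow component can be pictured with a minimal Python sketch. It assumes a generic linear-reservoir formulation in which outflow is proportional to storage; the function name `two_zone_step`, its parameters and the order of operations are illustrative assumptions and are not taken from the LISFLOOD code.

```python
def two_zone_step(upper, lower, recharge, percolation_rate,
                  t_upper, t_lower, dt=1.0):
    """Advance two connected linear reservoirs by one time step.

    upper, lower      : storages of the upper and lower zone (mm)
    recharge          : drainage from the soil into the upper zone (mm)
    percolation_rate  : maximum transfer from upper to lower zone (mm/day)
    t_upper, t_lower  : reservoir time constants (days)
    Returns (new_upper, new_lower, total_outflow_to_channel).
    """
    upper += recharge

    # Transfer from the upper to the lower zone, limited by available water.
    percolation = min(upper, percolation_rate * dt)
    upper -= percolation
    lower += percolation

    # Linear reservoir: outflow is proportional to the current storage.
    q_upper = min(upper, upper * dt / t_upper)
    q_lower = min(lower, lower * dt / t_lower)
    upper -= q_upper
    lower -= q_lower

    return upper, lower, q_upper + q_lower
```

In such a scheme the water leaving the lower zone never reaches neighbouring cells, which is the limitation that the MODFLOW coupling described below addresses.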

The processes that are simulated by the model include snow melt, infiltration, interception of rainfall, leaf drainage, evaporation and water uptake by vegetation, surface runoff, preferential flow (bypass of soil layer), exchange of soil moisture between the two soil layers and drainage to the groundwater, sub-surface and groundwater flow, and flow through river channels (Burek et al. 2013).

The meteorological variables driving the LISFLOOD model (precipitation, temperature, wind speed, potential evapotranspiration, and evaporation rates for open water and bare soil surfaces) were derived from various data sources, as in previous applications of this model (Udias et al. 2016). The sources include the JRC MARS meteorological database, SYNOP data, and data from the European Climate Assessment & Dataset (ECA&D). All meteorological variables were interpolated on a 5 × 5 km² grid. Land use maps of forest, water and sealed (impermeable surface) fractions were derived from LUMP (Lavalle et al. 2011) at 100 m resolution (Udias et al. 2016). SYNOP stations measure, among other parameters, daily precipitation and soil moisture across Europe, while LUMP is a land use model developed by the Joint Research Centre of the European Commission to support impact assessment of European environmental policy.

In the Pan-European setup, LISFLOOD runs on a 5 × 5 km² grid that covers Europe and the Mediterranean region. Each grid cell has many attributes, which characterize the variability of parameters within the 5 × 5 km² area. For more in-depth information on how the LISFLOOD model is set up, or on the calculations that constitute the surface water part of the model, please refer to the manual (van der Knijff et al. 2010) and previous publications (Burek et al. 2013). The version of LISFLOOD used for this work is the 2013 version, written in the PCRaster programming language. We modified the original code to replace the groundwater simulation module that already existed in LISFLOOD with a two-dimensional groundwater model.

The groundwater model used here is MODFLOW, called in console mode (i.e., without a graphical user interface) from within the modified LISFLOOD PCRaster code. Because MODFLOW works on a grid of cells, just as LISFLOOD does, we could use the same resolution and run both models on the same grid. The resulting model therefore has a one-to-one coupling of surface and groundwater cells (Fig. 2). The coupling is done in an iterative way: for each time step, the coupled model runs the surface model equations and subsequently invokes the groundwater model with the calculated results. Next, it feeds the results of the groundwater model back to the surface water model before proceeding to the next time step.
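The alternation between the two models within each time step can be sketched as a simple loop. This is only a schematic in Python: `run_coupled`, `toy_surface` and `toy_groundwater` are hypothetical names, and the two toy functions are placeholders that carry no LISFLOOD or MODFLOW physics.

```python
from typing import Callable, List, Tuple


def run_coupled(
    forcings: List[float],
    surface_step: Callable[[float, float], float],
    groundwater_step: Callable[[float, float], Tuple[float, float]],
    initial_head: float,
) -> List[Tuple[float, float]]:
    """Alternate the surface and groundwater models once per time step.

    surface_step(forcing, river_exchange) -> recharge passed to groundwater
    groundwater_step(head, recharge)      -> (new_head, river_exchange)
    """
    head = initial_head
    river_exchange = 0.0  # nothing has been exchanged before the first step
    history = []
    for forcing in forcings:
        # 1. Surface model, using the exchange from the previous step.
        recharge = surface_step(forcing, river_exchange)
        # 2. Groundwater model, driven by the freshly computed recharge.
        head, river_exchange = groundwater_step(head, recharge)
        # 3. river_exchange is carried into the next time step.
        history.append((head, river_exchange))
    return history


# Toy stand-ins, purely illustrative.
def toy_surface(forcing: float, river_exchange: float) -> float:
    return 0.5 * forcing + river_exchange


def toy_groundwater(head: float, recharge: float) -> Tuple[float, float]:
    return head + recharge, 1.0


history = run_coupled([2.0, 4.0], toy_surface, toy_groundwater, 10.0)
```

The point of the sketch is the data flow: the river exchange computed by the groundwater step only influences the surface model in the following time step.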

MODFLOW replaces what used to be the lower zone in the LISFLOOD model and simulates its processes (Fig. 3). The added or modified processes include:

  • the river channel can now also send water to the groundwater instead of only receiving;

  • the losses of the groundwater have been substituted by groundwater exchanges among adjacent nodes.

Fig. 3

Changes in the model from the previous version

The river may either contribute water to the groundwater system or act as a groundwater discharge zone, depending on the hydraulic gradient between them. MODFLOW simulates the surface water/groundwater interaction through a seepage layer that separates the river from the aquifer. The exchange is computed with a classical Darcy equation. The model requires the following input information for each grid cell containing a River boundary: (a) water elevation in the river; (b) riverbed elevation; and (c) conductance of the seepage layer.
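The three inputs listed above are enough to compute the exchange for one cell. The sketch below follows the head-dependent flux relation documented for the MODFLOW River package, in which seepage becomes independent of the aquifer head once the head falls below the riverbed; the function and argument names are illustrative, and the sign convention (positive when the river feeds the aquifer) is an assumption made for this example.

```python
def river_leakage(river_stage, riverbed_elevation, conductance, aquifer_head):
    """Flow between river and aquifer for one cell (m3/day).

    river_stage        : water elevation in the river (m)
    riverbed_elevation : elevation of the riverbed bottom (m)
    conductance        : conductance of the seepage layer (m2/day)
    aquifer_head       : hydraulic head in the aquifer cell (m)
    Positive values mean the river loses water to the aquifer.
    """
    # Below the riverbed, the aquifer head no longer controls the seepage.
    effective_head = max(aquifer_head, riverbed_elevation)
    return conductance * (river_stage - effective_head)


losing = river_leakage(102.0, 100.0, 50.0, 101.0)        # river recharges aquifer
gaining = river_leakage(102.0, 100.0, 50.0, 103.0)       # aquifer discharges to river
disconnected = river_leakage(102.0, 100.0, 50.0, 95.0)   # head below riverbed
```

Mapping the sign of this quantity over all river cells gives the recharge and discharge zones discussed in this work.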

With this new setup, the groundwater does not disappear from a cell; instead, it moves through the aquifers and ultimately discharges to the sea. This approach is more in accordance with the physical phenomenon. One limitation remains: when aquifers discharge to springs in cells that are not adjacent to the aquifer cells, the model is still unable to account for that transfer of water, since it cannot simulate exchanges between non-adjacent cells.

All the input maps of the coupled model were created for the aforementioned grid setup. These included daily values of meteorological and hydrological parameters, as well as the hydraulic conductivity and porosity of the aquifers, the riverbed elevation and hydraulic conductivity, the initial head in the aquifer and the boundary conditions. Working at the Pan-European scale allowed us to use the actual shoreline of the continent as a constant hydraulic head boundary condition.
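To illustrate how water moves between adjacent cells and how a constant-head boundary such as the shoreline acts, the following Python sketch advances a small grid by one explicit finite-difference step of two-dimensional confined flow. It is not MODFLOW's solver (MODFLOW solves an implicit finite-difference system); the function name, the uniform transmissivity and storage coefficient, and the no-flow treatment of the grid edges are simplifying assumptions.

```python
def explicit_head_step(head, fixed, transmissivity, storage, dx, dt,
                       recharge=0.0):
    """One explicit finite-difference step of 2D confined groundwater flow.

    head           : list of rows of hydraulic head (m)
    fixed          : same shape, True where head is held constant
                     (e.g. shoreline cells)
    transmissivity : uniform transmissivity (m2/day)
    storage        : uniform storage coefficient (dimensionless)
    dx, dt         : cell size (m) and time step (days)
    recharge       : uniform recharge (m/day)
    Edges of the grid are treated as no-flow boundaries.
    The scheme is only stable if transmissivity*dt/(storage*dx**2) <= 0.25.
    """
    n_rows, n_cols = len(head), len(head[0])
    new_head = [row[:] for row in head]
    for i in range(n_rows):
        for j in range(n_cols):
            if fixed[i][j]:
                continue  # constant-head cell keeps its value
            # Net inflow from the adjacent cells, per unit area (m/day).
            exchange = 0.0
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < n_rows and 0 <= nj < n_cols:
                    exchange += (transmissivity
                                 * (head[ni][nj] - head[i][j]) / dx ** 2)
            new_head[i][j] = head[i][j] + dt * (exchange + recharge) / storage
    return new_head


# One row of three cells; the left cell is a constant-head (sea) cell.
heads = explicit_head_step(
    [[0.0, 10.0, 10.0]], [[True, False, False]],
    transmissivity=1.0, storage=1.0, dx=1.0, dt=0.1,
)
```

In this toy run only the cell next to the fixed-head cell changes in the first step, since exchanges occur between direct neighbours only.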

An example of an input map to the model is the hydraulic conductivity in meters per day (Fig. 4). Although the surface model is set to run over the full extent, the availability of data limited the groundwater component to Europe and part of Turkey. Moreover, because the coupled model is computationally more demanding, we decided to limit its nodes to those where groundwater is present and significant. Using the International Hydrogeological Map of Europe (IHME) (Duscher et al. 2015) as a reference, we excluded those grid cells classified as “local and limited groundwater” or “essentially no groundwater” in the IHME classification (Trichakis et al. 2016).
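The exclusion of cells can be expressed as a boolean mask over the grid. The Python sketch below uses the two IHME labels quoted above; the function name, the grid layout and the other class label in the example are hypothetical.

```python
# IHME classes excluded from the groundwater calculations (from the text).
EXCLUDED_CLASSES = {
    "local and limited groundwater",
    "essentially no groundwater",
}


def active_cell_mask(ihme_class_grid):
    """Return True for cells kept in the groundwater model, False otherwise."""
    return [
        [cell_class not in EXCLUDED_CLASSES for cell_class in row]
        for row in ihme_class_grid
    ]


# Hypothetical 2 x 2 excerpt of a class map.
example_grid = [
    ["productive aquifer", "essentially no groundwater"],
    ["local and limited groundwater", "productive aquifer"],
]
mask = active_cell_mask(example_grid)
```

Cells where the mask is False would simply be skipped by the groundwater component while remaining part of the surface model.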

Fig. 4

The Pan-European model setup; example of an input map (hydraulic conductivity [m/d])

It is worth noting that the two-dimensional approach adopted in this work represents the aquifers that are in direct hydraulic connection with the river. The lack of detailed data on stratigraphy and lithology prevented a three-dimensional approach, which could also include deeper aquifers and/or aquifers disconnected from the river network. Pending the collection of more detailed information, which has already started in some countries of the Danube river basin, the model can be transformed into a three-dimensional setup at the same or even higher resolution.

As shown in Fig. 3, in the coupled model the water transfer from the upper to the lower zone is the input of the MODFLOW model at every iteration. The model’s outputs are the simulated hydraulic head and the water exchanges between the river and the aquifer. The latter become an input for the surface model in the next iteration, thus completing the coupling of the two models.

A few basic assumptions were inevitable, given the large scale of implementation and the lack of detailed data. Among them, porosity was constant throughout the model (the default value of MODFLOW, i.e. 0.3), while for the hydraulic conductivity we created four different classes, in line with the IHME classes. Even when data on hydrogeological parameters are available for specific lithological formations within a grid cell, the user needs to aggregate the different values into a single one that describes the grid cell characteristics. The model parameters need to be representative of the aggregated processes rather than of one formation, even the predominant one (Trichakis et al. 2016).
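The text does not prescribe a particular aggregation rule. As an illustration only, the Python sketch below shows two standard averages for hydraulic conductivity: the weighted arithmetic mean, which applies to flow parallel to the formations, and the weighted harmonic mean, which applies to flow crossing them in series. The function names and the example values are hypothetical.

```python
def weighted_arithmetic_mean(values, weights):
    """Effective conductivity for flow parallel to the formations."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_harmonic_mean(values, weights):
    """Effective conductivity for flow crossing the formations in series."""
    return sum(weights) / sum(w / v for v, w in zip(values, weights))


# Two formations with equal share of the cell, K = 10 and 1 m/day
# (made-up numbers for illustration).
k_values = [10.0, 1.0]
shares = [1.0, 1.0]
k_parallel = weighted_arithmetic_mean(k_values, shares)
k_series = weighted_harmonic_mean(k_values, shares)
```

The two results differ markedly, which shows why the choice of aggregation matters when one value has to stand for a whole 5 × 5 km cell.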

2.2 Available Data

High-quality, previously homogenized meteorological data were available for the whole area for a long period (9070 days, from 1/1/1990 to 31/10/2014). These included:

  • Meteorological data (temperature, precipitation, humidity etc.) (Ntegeka et al. 2013);

  • Landcover data based on CORINE data (Batista e Silva et al. 2012);

  • Soil properties from the Harmonized World Soil Database (FAO/IIASA/ISRIC/ISSCAS/JRC 2009);

  • Digital elevation model data based on SRTM, channel network data based on Hydroshed (Lehner et al. 2006) and The Pan-European River and Catchment Database (Vogt et al. 2007).

In contrast, the available groundwater data were limited, so we restricted the analysis of our results to the area where data were already available. In the meantime, the collection and homogenization of groundwater data is under way, and we will update the results as soon as the homogenized groundwater parameter datasets are available. At the time of our first application of the model, we were able to check our simulated results against observed groundwater levels, which were available for the Danube river basin part of Bulgaria, with 2911 data records at 49 locations.

3 Results and Discussion

The original LISFLOOD model, as well as the coupled version, can create a vast range of outputs, mainly in the form of raster maps and time-series files. In this work, we exported maps that include, at each time step, hydraulic head, leakage from/to the river, and river flow. Many other outputs can be included, depending on user needs. In our preliminary run, the most interesting outputs for our analysis were the hydraulic head maps and the river leakage maps. The former show the seasonal differences in groundwater levels, while the latter show the recharge and discharge zones of the aquifer.

Figure 5 shows an example map of model output, i.e., the simulated groundwater depth in meters, derived from the combination of the elevation and the hydraulic head maps. Cells excluded from the MODFLOW calculations are in grey. The map shows that, in general, the water table is deeper at higher altitudes and in semi-arid areas like southern Spain and eastern Greece.

Fig. 5

Example of the model output: simulated groundwater depth [m]
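The groundwater depth map combines two rasters cell by cell. A minimal Python sketch, assuming depth is computed as surface elevation minus hydraulic head and that excluded cells carry no value; the function name and the use of `None` for excluded cells are illustrative choices.

```python
def groundwater_depth(elevation, head, active):
    """Depth to the water table (m) per cell; None for excluded cells.

    elevation : list of rows of surface elevation (m)
    head      : list of rows of simulated hydraulic head (m)
    active    : list of rows of booleans, False where the cell is excluded
    """
    return [
        [e - h if a else None for e, h, a in zip(e_row, h_row, a_row)]
        for e_row, h_row, a_row in zip(elevation, head, active)
    ]


depth = groundwater_depth(
    elevation=[[120.0, 300.0]],
    head=[[110.0, 250.0]],
    active=[[True, False]],
)
```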

The model’s preliminary run did not include a calibration with observed groundwater data; calibration of the input parameters will be the object of future work. Instead, we selected the model parameters according to information from the IHME and literature values. Nevertheless, to check whether the simulated values agree with the observed data, we conducted some post-simulation checks. After the first full run, we compared the results with observed values from Bulgaria, the country for which data were already available.

3.1 Case Study: Bulgaria

In a first attempt, we compared the results of our approach with observed hydraulic head levels in wells in Bulgaria. The data were available from 1990 until 2014 for 49 different locations within the Danube river basin part of the country. A statistical analysis of observed and simulated data indicated a good correlation between them. However, the other statistical indicators in Table 1 make the inability of the model to describe the situation precisely at a specific location more obvious.

Table 1 Statistical quality indices between observed and simulated values from Bulgaria

Since elevation influences the model output to some degree, the high correlation between the observed and simulated data sets might be misleading. For this reason, we focused on the residuals between observed and simulated values, analysing them to better understand where they come from and how they can be minimized in the future.

The results show a negative bias of almost 20 m, i.e., the model tends to underestimate the observed values. One could argue that 20 m is a large number for groundwater resources, since it is comparable with the depth of many wells. The number does not seem so large, though, if we take into account the coarse setup of the model and the fact that the simulated values represent a cell average rather than a value at an exact point. These discrepancies are therefore plausible, considering the fluctuations of groundwater within a 5 × 5 km area. In addition, pumping was not included in this version of the model, owing to the lack of pumping well data or estimates, which we are currently collecting from local experts. This might also have caused discrepancies between observed and simulated values. On the other hand, it also shows that this model setup is not suitable for managing a single well. It is worth noting that the correlation between elevation and observed hydraulic head is high (R² = 0.9792), but still lower than the correlation between observed and simulated hydraulic heads (R² = 0.9851).
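The coexistence of a large bias and a high correlation is easy to reproduce. The Python sketch below assumes that bias is the mean of simulated minus observed values and that R² is the squared Pearson correlation; the text does not state the exact definitions used, and the sample values are made up for illustration.

```python
from statistics import mean


def bias(observed, simulated):
    """Mean of (simulated - observed); negative means underestimation."""
    return mean(s - o for o, s in zip(observed, simulated))


def r_squared(x, y):
    """Squared Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Made-up heads (m): every simulated value is 20 m below the observed one.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [-10.0, 0.0, 10.0, 20.0]
example_bias = bias(observed, simulated)
example_r2 = r_squared(observed, simulated)
```

A constant offset leaves the correlation untouched, which is why the correlation alone says little about the accuracy at a specific location.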

Figure 6 depicts the regression residuals between simulated and observed hydraulic head data against the fitted values. Although the observed data range from 5 m to 500 m, which is a large elevation range, the magnitude of the hydraulic head does not seem to influence the model performance. In other words, the model describes low and high values of hydraulic head equally well. The regression residuals are not correlated with the elevation or the observed hydraulic head values (R² is 0.029 and 0.084, respectively), and the absolute values of the residuals even less so (R² is 0.016 and 0.015, respectively). The points with the highest overestimation and underestimation are thus not located at either end of the range of fitted values, which suggests an error (probably systematic) that can be attributed to the lack of calibration so far.

Fig. 6

Residual analysis of simulated values from the model vs. observed groundwater levels in Bulgaria

From the residual analysis in Fig. 6, we can also conclude that the majority of the smaller residuals are close to normally distributed. The sudden increase in the residuals implies that for some wells the lack of fit was large and systematic.
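A residual analysis of this kind starts from a least-squares line and the differences between the data and that line. The sketch below assumes that simulated heads are regressed on observed heads, which the text does not state explicitly; the function names and sample values are illustrative.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def regression_residuals(x, y):
    """Return (fitted values, residuals) of the least-squares line."""
    slope, intercept = fit_line(x, y)
    fitted = [intercept + slope * a for a in x]
    residuals = [b - f for b, f in zip(y, fitted)]
    return fitted, residuals


# Made-up observed (x) and simulated (y) heads in meters.
obs = [0.0, 1.0, 2.0, 3.0]
sim = [0.0, 2.0, 1.0, 3.0]
fitted, residuals = regression_residuals(obs, sim)
```

Plotting `residuals` against `fitted`, or against elevation, gives the kind of diagnostic shown in Fig. 6.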

Figure 7 shows the time series of simulated and observed hydraulic head for a single well located in Bulgaria. Although the model seems to simulate the trend of the observed values, the fit is not perfect. The oscillations of the observed values are larger, even though they are still just a few centimeters (mean absolute value of 12 cm for the observed, 2 cm for the simulated). We expect that a better calibration of the aquifer parameters, in particular hydraulic conductivity and specific storage, will give better agreement with the observed data.

Fig. 7

Observed and simulated hydraulic head [m] time series for a well in Bulgaria

4 Conclusions

Coupling a surface water and a groundwater model at large scale requires some conceptual changes to the way a numerical groundwater model is normally used. The large size of the grid cells requires a non-traditional approach to various aspects. From hydraulic conductivity to porosity, the parameter values have to represent the aggregated grid cell rather than a single geological formation.

The model is able to produce, among others, maps of hydraulic heads as well as discharge and recharge zones for each time step. This can allow users to have a dynamic view of the system rather than a static one, and take into account seasonal differences. Other outputs, including maps of water table depth, are also easy to produce depending on the user needs.

For a first iteration, the simulated values have a high degree of agreement with the observed ones. There are discrepancies at some specific points, which show that this resolution is not suitable for managing a single well. In contrast, it is suited to obtaining the big picture and managing water resources at a large scale. Up to this point, the model parameters come from literature information rather than calibration. This explains a large part of the discrepancies between simulated and observed data. The simulated data also tend to have smaller monthly oscillations than the observed data, which shows that the model can benefit from a calibration to simulate the physical processes better. Another possible reason for these discrepancies is the lack of pumping well data.

Even at this early stage of the work, the results show that including a coupled groundwater model instead of a conceptual, non-lateral groundwater zone will improve the performance of large-scale hydrological models and is a necessary step towards an integral view of water resources management.

When more input data become available, we can do additional checks and expect an even higher degree of agreement. Planned future activities include the collection of data from more countries and the comparison of simulated results with the observed values. At a second stage, a calibration of model parameters based on the collected data will try to improve the model’s ability to simulate the physical system.