1 Introduction

This paper reviews the current developments in understanding and modelling groundwater-surface water interactions at the regional scale. The focus is on practical aspects of this topic, i.e. the degree to which currently available scientific knowledge can provide and support solutions to actual practical problems in the field of integrated modelling of groundwater-surface water interaction at the regional scale. The term “regional scale” is used here to describe areas of approximately 10³ to 10⁵ km².

1.1 Coverage of Groundwater-surface Water Interaction at the Regional Scale in the Literature

Groundwater-surface water interaction (hereafter abbreviated to GW-SW) has received a lot of attention in recent decades. Many individual studies and several summary articles have been published, and the number of related publications is steadily growing. In his state-of-the-science review, Sophocleous (2002) summarises the fundamental concepts and implications of GW-SW from a predominantly hydraulic-hydrogeological viewpoint. Winter and Rosenberry (1995) and Winter (1999) take a similar approach, focusing particularly on the hydraulic conditions related to various types of surface waters. Brunke and Gonser (1997) and Hayashi and Rosenberry (2002) provide comprehensive overviews with special emphasis on the ecology of surface waters. Rosenberry and LaBaugh (2008) as well as Kalbus et al. (2006) give overviews of field techniques for estimating fluxes between groundwater and surface water at different scales. Anibas et al. (2012) describe specific types of interaction in wetlands and regional modelling approaches over large wetlands, e.g. by Bauer et al. (2006). Werner et al. (2013) focus on the interaction between groundwater and sea water. There is much focus on the interaction between groundwater and rivers and streams, while interaction with lakes and other non-flowing water bodies is underrepresented (see, e.g. Elsawwaf et al. 2014; Tweed et al. 2009). Dahl et al. (2007) review different classification systems of GW-SW in a process-oriented manner and propose a new typology for these interactions, which can be used for a better comparison of different settings or sites. Levy and Xu (2012) review and compare methods to describe GW-SW at different scales and list exemplary applications in South Africa.

Research efforts relating to GW-SW have recently focused on the hyporheic zone (near-channel and in-channel processes, e.g. Allen et al. 2010; Banzhaf et al. 2013; Boulton et al. 1998; Krause et al. 2009). Much less research has been carried out at the local or flood plain scale (Langhoff et al. 2006, see also section 2.1.2). When looking at larger scales, process-based investigations become scarce; only a few studies, e.g. studies of the Murray-Darling Basin Sustainable Yields Project by CSIRO (2008) and Lamontagne et al. (2014), address the regional scale.

Looking at the literature in general, it seems that the governing hydraulic processes of GW-SW are quite well understood and knowledge of how to address specific conditions is rapidly increasing. This facilitates a conceptualisation of GW-SW at smaller scales in a large variety of settings. However, many questions still remain unanswered when dealing with heterogeneities in the aquifer and surface water channels (e.g. Fleckenstein et al. 2006). The combination of local scale heterogeneity of hydrological/hydrogeological properties with the large number of different catchment types (McDonnell and Woods 2004; Wagener et al. 2007) creates a wide variety of possible settings. This makes it difficult to transfer knowledge obtained at the point scale to the local and catchment scale.

Little material has been published that explicitly addresses the topic of GW-SW at the regional scale based on field experiments and observations or by attempting to derive concepts from fundamental, theoretical considerations (as one of the few examples see Akiyama et al. 2007). The relatively numerous publications on regional-scale integrated model applications of GW-SW (see section 3) are usually more concerned with model-specific issues than with a fundamental analysis of the GW-SW problem.

1.2 Objectives and Scope of this Review

The previous section shows that GW-SW, as a subject in general, has received a lot of attention in recent decades. The decision to present this review paper in addition to the existing literature is motivated by the general consensus that there is a great demand for integrated solutions relating to water resources (see, e.g. Kalbus et al. 2012; Savenije and Van der Zaag 2008) and a growing demand for scientifically sound approaches to managing and using water resources at the regional scale. It is only at the regional scale that environmental, economic and social problems linked to water resources can be analysed and solved in an integrated way (see Barthel 2014a; Barthel 2014b for a detailed discussion and further references). As the links between groundwater and surface water are of the utmost importance with respect to the integrated management of water resources (e.g. Winter et al. 1998; Wood et al. 2011), the question of how GW-SW at the regional scale can be studied, analysed and understood must be recognised as essential. Despite this, it seems that there is very little guidance available in scientific publications about how to approach GW-SW at the regional scale. Over recent years, the first author has put considerable effort into developing integrated models of the hydrological cycle at the regional scale (Barthel et al. 2008a; Barthel et al. 2012; Gaiser et al. 2008; Ludwig et al. 2003). These efforts ran into a multitude of problems which had not been described or solved previously (see, e.g. Götzinger et al. 2008; Wolf et al. 2008). Attempts to apply concepts that are well-established at smaller scales were not successful. The biggest problem seems to be finding a concept which balances the desired model results and the availability and distribution of data. The lack of scientific guidance leads to a discrepancy between the necessity to understand GW-SW at the regional scale and the current level and organisation of knowledge to support this understanding. A literature review that explicitly looks at the regional scale is missing.

GW-SW at the regional scale is a subject that obviously touches upon a plethora of different aspects that span a wide range of fundamental problems in hydrology. This adds to the challenge because literature that can be clearly and directly associated with the subject GW-SW at the regional scale hardly exists (see section 1.1). GW-SW at the regional scale is therefore difficult to grasp as a subject and it is not possible to cover all the different aspects involved at the same level of detail in a single journal article. In contrast to a classic review paper, which should provide a permanent benchmark for the subject area, this paper is designed as an overview with the goal of obtaining a better understanding of the subject and determining future research needs. The paper focuses mainly on the following:

  • A discussion of the special characteristics of regional scale GW-SW in comparison to smaller scales, including an attempt to define relevant processes for that scale;

  • an overview of the literature that can be related to regional scale GW-SW; and

  • an overview of tools and models available and applied to regional scale problems with a focus on GW-SW.

This paper does not review the physics-based concepts and the corresponding mathematical formulations of the GW-SW problem and the analytical or numerical solutions that may be applied, as those are sufficiently covered in the literature mentioned in section 1.1. The discussion is furthermore restricted to the water quantity related aspects of regional scale GW-SW (i.e. fluxes and volumes) and not chemistry or biology.

2 Conceptual Differences of GW-SW at Different Scales

The following section compares the essential features of GW-SW at different scales. As the use of terms describing the spatial (and temporal) scales varies greatly between different disciplines, the use within this paper is defined here:

The point scale is what we define as the smallest spatial entity that can be used to study GW-SW in the field. The local scale is the scale where the interaction between one stream/river reach and the adjacent (alluvial) aquifer can be studied. The sub-catchment or small catchment scale refers to study areas which encompass the entire watershed of a small catchment. Finally, the term regional scale is used for catchments of 10³ to 10⁵ km², as stated at the beginning of this paper.

2.1 Characteristics of the Point Scale

The essential characteristic of the point scale (or plot-scale; see Figure 1a) is that the fundamental physico-bio-chemical processes in the pore space at the boundary between surface water and aquifer can be observed in detail. Therefore, quantitative process descriptions based on the elementary laws of fluid mechanics are feasible at this scale. The main drivers of GW-SW are pressure gradients within and between the groundwater and surface water bodies in question. Significant properties determining the interaction are pore size and distribution, pore geometry, and connectivity of the aquifer substratum and riverbed (see, e.g. Woessner 2000). While very detailed observations are possible, it has to be noted that such observations are usually restricted to the areas shown in pale boxes in Figure 1a. Within this restricted domain of interest, elementary processes can be adequately observed and described, but it is not possible to take into account the entirety of processes that take place in the aquifer as a whole or within the entire surface water body involved. Processes that take place outside these areas have to be regarded as external processes and described as boundary conditions of the observed system.

Fig. 1 Schematic representation of (a) the point scale (or plot-scale), with only influent conditions shown; (b) the local scale (or reach scale); (c) the sub-catchment scale (or small catchment scale). The system may include different surface water reaches, different hydrostratigraphic units and different types of aquifers

The situation at the point scale can be used to demonstrate one of the most central concepts of GW-SW, i.e. quantification of the exchange fluxes between surface water and groundwater using the difference in head and a coefficient that describes the conductance of the bottom layer of the river. The relatively simple and straightforward concept behind this has been extensively described and discussed in the literature (see, e.g. Sophocleous 2002). In brief, the concept builds on Darcy’s Law. The flow q across the interface between groundwater and surface water is expressed as the product of a constant representing the streambed leakage coefficient and the difference in head. This concept is widely used to describe GW-SW in many modelling approaches (see section 3.2). While apparently straightforward, the concept has been criticised, as the streambed leakage coefficient, which is typically not measurable in the field, can change drastically over time and a clear interface does not necessarily always exist (e.g. Kollet and Maxwell 2006). The coefficient is therefore usually determined by inverse modelling together with other parameters, e.g. the hydraulic conductivity of adjacent aquifers or, in some cases, groundwater recharge (Carrera et al. 2005; Hill and Tiedeman 2007). A calibrated modelling parameter which can hardly be verified or even constrained by field observations represents a considerable weakness of mechanistic modelling concepts. It is, therefore, seen as an advantage of the so-called fully coupled or physics-based modelling concepts (see section 3.1) that they do not rely on such an exchange coefficient (e.g. Brunner and Simmons 2012). When it comes to larger scales, the concept loses some of its relevance, as the focus of interest steadily moves away from the riverbed and processes in the adjacent areas gain more and more importance (see the following sections).
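The head-difference concept can be illustrated with a minimal sketch. It assumes a single lumped leakage (conductance) coefficient per river segment and a sign convention in which positive flux means the river loses water to the aquifer (influent conditions). The function name, units and numbers are hypothetical and chosen for illustration only; they are not taken from the cited literature.

```python
def exchange_flux(h_river, h_aquifer, conductance):
    """Darcy-type exchange flux between a river segment and the aquifer.

    q = C * (h_river - h_aquifer)

    Assumed sign convention (not prescribed by the text):
      q > 0  river loses water to the aquifer (influent)
      q < 0  aquifer discharges into the river (effluent)
    """
    return conductance * (h_river - h_aquifer)


# Hypothetical values: heads in m, conductance in m^2/d, flux in m^3/d.
q_influent = exchange_flux(h_river=101.0, h_aquifer=100.5, conductance=50.0)
q_effluent = exchange_flux(h_river=101.0, h_aquifer=101.8, conductance=50.0)
print(q_influent, q_effluent)
```

The whole interaction is controlled by one coefficient, which is why the difficulty of measuring or constraining that coefficient weighs so heavily in the criticism summarised above.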

2.2 Characteristics of the Local Scale

At the local scale (or reach scale, Figure 1b), a larger cross-section of the river with its floodplain and adjacent geological units can be included. The main difference, compared to the point scale, is that the fundamental processes can no longer be described in a discrete way as it is no longer possible to collect the required number of observations. It should, however, not be forgotten that this is a merely practical limitation and that this does not mean that the fundamental processes as such change or lose their relevance (see section 2.5). However, in practice it means that effective parameters are required and, thus, mathematical descriptions require some form of aggregation or generalisation. In particular, the concept of GW-SW controlled by head in surface water and groundwater and a streambed leakage coefficient (see section 2.1) becomes more challenging to apply, as all the parameters involved are known to show large spatial (and temporal) heterogeneity (see, e.g. Fleckenstein et al. 2006; Irvine et al. 2012).
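Why aggregation is problematic can be shown with a small numerical sketch. It assumes a reach made up of three segments, each with its own conductance and head difference, and defines a hypothetical "effective" conductance as the total flux divided by the mean head difference. All names and values are illustrative assumptions rather than a method from the cited studies.

```python
def total_flux(conductances, head_diffs):
    """Sum of segment fluxes q_i = C_i * dh_i along a reach."""
    return sum(c * dh for c, dh in zip(conductances, head_diffs))


def effective_conductance(conductances, head_diffs):
    """Lumped conductance that reproduces the total flux when multiplied
    by the reach-averaged head difference (assumed to be non-zero)."""
    mean_dh = sum(head_diffs) / len(head_diffs)
    return total_flux(conductances, head_diffs) / mean_dh


# Hypothetical reach with three segments (conductance in m^2/d, heads in m).
conductances = [10.0, 50.0, 200.0]
uniform_heads = [0.5, 0.5, 0.5]     # same head difference everywhere
variable_heads = [1.0, 0.25, 0.25]  # same mean, different spatial pattern

c_eff_uniform = effective_conductance(conductances, uniform_heads)
c_eff_variable = effective_conductance(conductances, variable_heads)
print(c_eff_uniform, c_eff_variable)
```

In this toy case the two head patterns share the same mean, yet the lumped coefficient differs by almost a factor of two. An effective parameter therefore depends on the state of the system as well as on the material properties.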

On the other hand, it is still not possible to calculate closed balances for the groundwater and surface water bodies involved. Usually, neither entire gauged surface water catchments nor the whole extent of the aquifer in question can be studied at this scale. Thus, processes that occur outside the immediate area of interest, but which still have to be considered as drivers of the system (runoff formation, groundwater recharge, regional groundwater flow, etc.), have to be represented as boundary conditions of some sort. The main challenges at this scale, i.e. areas too large to allow discrete process descriptions and too small to study the hydrological system (catchment) in its entirety, are well known and described in the literature, as it is this local or reach scale that receives by far the most attention from scientists.

Looking at the relevance of processes, properties and conditions, it can be inferred from a comparison of Figure 1a) and Figure 1b) that processes within the entire cross section of the alluvial aquifer, exchange fluxes between different aquifers, land surface processes and unsaturated zone (UZ) processes have to be added to the immediate exchange between aquifer and river at the interface. On the other hand, it can be argued that these processes have the same influence at the point scale as at the reach scale – they just lie outside the area of interest and might be included as boundary conditions. See section 2.5 for an extended discussion.

2.3 Characteristics of the Sub-catchment Scale

The sub-catchment scale (Figure 1c) covers the entire drainage area of a given point in a stream (gauge). It is usually assumed that this allows the calculation of a closed balance for the catchment, with the discharge at the catchment outlet (gauge) as an integral measure of control. It should be noted that, from a groundwater perspective, fluxes across the catchment boundaries are always possible, except at the continental scale. However, most hydrological applications are, to a large degree, based on the assumption that a catchment is a closed system.
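The closed-balance assumption can be written as a simple bookkeeping sketch. It assumes the standard catchment water balance, in which storage change equals precipitation minus evapotranspiration minus discharge at the gauge, plus an optional net groundwater flux across the catchment boundary. The function name, the units (mm per year) and the numbers are hypothetical.

```python
def storage_change(precip, evapotranspiration, discharge,
                   net_gw_boundary_flux=0.0):
    """Catchment water balance: dS = P - ET - Q + G (all in mm per period).

    net_gw_boundary_flux (G) is positive for net subsurface inflow.
    G = 0 corresponds to the closed-system assumption.
    """
    return precip - evapotranspiration - discharge + net_gw_boundary_flux


# Hypothetical annual values in mm.
closed = storage_change(precip=900.0, evapotranspiration=500.0,
                        discharge=380.0)
leaky = storage_change(precip=900.0, evapotranspiration=500.0,
                       discharge=380.0, net_gw_boundary_flux=-30.0)
print(closed, leaky)
```

If an unaccounted subsurface outflow exists, gauged discharge alone no longer closes the balance. In this toy case the sign of the inferred storage change reverses.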

In contrast to the local scale (Figure 1b), GW-SW must focus on several different streams (tributaries) that might be connected to different aquifer formations. A catchment of this size will often allow a more comprehensive consideration of the subsurface system, i.e. taking into account the existence of different geological formations, vertical differentiation (stratigraphy, tectonic structures), and their interplay with the relief. In populated areas, the characteristics and consequences of anthropogenic interference (hydraulic structures, groundwater and surface water withdrawal) are significant. Mathematical descriptions of processes in catchments need to become increasingly aggregated, i.e. even larger areas need to be regarded as homogeneous entities.

The increasing number of possible constellations (different tributaries, different aquifers, different GW-SW mechanisms) leads to increasing complexity of the system. Driving forces of GW-SW can be far away from the actual stream–aquifer interface. Again, it could be argued that this changes the relevance and importance of different processes. See section 2.5 for an extended discussion.

2.4 Characteristics of the Regional Scale

To illustrate the differences from the scales shown previously in Figure 1, Figure 2 shows the relief, river network and geology of the Neckar catchment, Germany.

Fig. 2 Heterogeneities of relevant surface and subsurface features at the regional scale (or catchment scale), using the Neckar catchment (14,000 km²) as an example (data provided by the state geological survey and environmental agency of the federal state of Baden-Württemberg, Germany). Note the almost negligible extent of alluvial formations

The characteristics of the regional scaled catchments are determined by a range of possible combinations of climate, geomorphology, geology, landscape types, and biological factors that can occur in parallel within the same study area (Dahl et al. 2007; Harvey and Bencala 1993; Larkin and Sharp 1992; Sophocleous 2002; Winter 1999). Any enlargement of the study area size will increase the variety of combinations of these factors and thus lead to an increase in complexity. Even in areas with otherwise exceptionally good data – such as in central Europe – observations are usually focused in areas of particular interest, such as densely populated or intensely cultivated areas. Consequently, data are patchy in space, time, and with regard to the number and choice of observed parameters (Candela et al. 2014). An additional issue is the heterogeneity of the data, which in many cases has been collected by different agencies, with different objectives, and over different periods and is consequently very inconsistent. Many authors define the relative scarcity of data as the main challenge for regional scale work (Candela et al. 2014; Refsgaard et al. 2010; Zhou and Li 2011). Looking at long time scales, it is often very difficult to judge whether data have been influenced by human activity (e.g. hydraulic structures, land use changes) or not.

One might argue that heterogeneity occurs at all scales and is thus not characteristic only of the regional scale. Whether or not this is true or relevant is probably primarily a matter of perspective and context. In the context of the present paper, we assume that a system, composed of many differing subsystems, is more complex and heterogeneous than each of its individual subsystems alone. The same is assumed for the patchiness of data (observations clustered in areas of special interest).

In summary, the regional scale (or catchment scale) differs mainly in size of the study area from the sub-catchment scale. Typical (yet not always realised) characteristics in contrast to the sub-catchment scale are:

  • Regional catchments can be subdivided into a number of gauged sub-catchments or sub-basins;

  • The heterogeneity of landscapes, relief, climatic conditions, and geological units within the study area increases significantly; and

  • Socio-economic and technical aspects become increasingly important. Most regional catchments will include managed hydraulic structures, and modified discharge networks. Water transfers from one catchment to another (through water supply, waste water, irrigation networks, crops, etc.) can play an important role. Source-sink / supplier-consumer relations etc. can be captured in their entirety.

To exemplify this kind of situation using a real world example, Figure 2 shows maps of the Neckar catchment in Germany, where GW-SW was intensively studied in integrated modelling projects (Barthel et al. 2008a; Götzinger and Bardossy 2007; Götzinger et al. 2008, see also section 3.3). It is obvious that in such a large heterogeneous catchment, various processes with strong interdependencies function at different spatial and temporal scales. As a result, GW-SW also becomes a spatially and temporally heterogeneous set of processes involving many aquifers and many networked surface water bodies. The likelihood that the type of GW-SW (influent or effluent; with or without full hydraulic contact) changes frequently both in space and time increases drastically from a local to a regional system. As already discussed for the sub-catchment scale, GW-SW at this scale is no longer a ‘one river-reach ↔ one aquifer’ phenomenon, but encompasses the entire catchment and a large number of different processes. In Figure 2, please note that the area covered by alluvial sediments, i.e. those geological formations typically considered when GW-SW is studied at local scales, is almost negligible in comparison to the geological formations that form regional aquifers. This means that regions far away from the actual interface between GW and SW must have a major impact on the actual exchange processes. Please also note that the geology affects the density of rivers and streams (less dense in karstic limestone areas) i.e. the way surface and subsurface processes interact.

2.5 GW-SW Related Processes at Different Scales

Scale dependencies of hydrological processes and the possibility of transferring properties, process descriptions and model parameters from one scale to another (up- and downscaling, regionalisation) have been intensively studied in recent decades. Upscaling, i.e. methods to make measurements, process descriptions, or model parameters identified at the local scale available for use at larger scales, has received wide attention in the hydrological literature (Becker and Braun 1999; Farmer 2002; Neuman and Di Federico 2003; Renard and de Marsily 1997; Sánchez-Vila et al. 1995). The same is true for regionalisation, i.e. the generalisation of data or model parameters obtained from distinct points or small spatial entities to larger areas (Diekkrüger et al. 1999; Kleeberg et al. 1999; Parajka et al. 2005; Samaniego et al. 2010). Few authors, however, deal explicitly with the scale dependency of GW-SW (CSIRO 2008; Dahl et al. 2007; Kollet and Maxwell 2008; Levy and Xu 2012). Most works dealing with scale dependency are limited to spatial extents that are significantly smaller than regions within the meaning of this paper. In consequence, there are few insights to be gained from this work and it remains largely undecided how to transfer local scale process knowledge to the regional scale.

It has frequently been argued in the literature that processes that are relevant at small scales might become irrelevant at larger scales (see, e.g. Bronstert et al. 2005). Kirchner (2006) and McDonnell et al. (2007) suggest that the governing equations that apply in small scale physics might not adequately describe large scale hydrological responses in heterogeneous systems. Several authors have tried to further conceptualise and formalise the different significance and relevance at different scales as well as in different environments, landscapes and climates and to compile this into a theory of dominant processes (Grayson and Bloeschl 2000; Sivakumar 2004; Sivakumar 2008; Sivapalan et al. 2003). The question of whether or not processes lose or gain relevance at the large scale has already been raised in the preceding sections. Based on a simple descriptive comparison of the systems at different scales, the following three general statements can be made.

  1. Processes that take place at the interface between groundwater and surface water (fluxes across the riverbed) determine, to a large degree, the nature and magnitude of GW-SW. These processes can be observed in detail at small scales. When moving to a larger scale, these processes take place unchanged but there are strong practical limits to observing them.

  2. Processes that can be regarded as driving forces of GW-SW (i.e. causing head changes in aquifers and river reaches) are not restricted to the immediate interface between groundwater and surface water. On the contrary, they can be steered by processes and properties of materials far away. These processes can usually only be observed in their entirety when looking at larger scales, yet they do not lose any of their relevance at small scales.

  3. For practical reasons, studying GW-SW at small scales means detailed observations in a small area and a simplification of the processes that occur outside this area. Studying GW-SW at large scales, means using observations with larger distances between them and a simplification of the processes at the immediate interface of groundwater and surface water.

Together, these three statements indicate that the question of process-relevance might not be something that can be generalised, but rather is a question of perspective and feasibility.

3 Modelling GW-SW at the Regional Scale

Despite the fact that regional-scale GW-SW is not often addressed explicitly as a research topic on its own, it implicitly receives a lot of attention through integrated models (e.g. Gaukroger and Werner 2011; Jolly and Rassam 2009; Rossman and Zlotnik 2013; Sebben et al. 2013). “Integrated modelling” often has a much wider focus than GW-SW. It may include coupling to atmospheric models, plant growth models, socio-economic models, etc. and the actual representation of water resources can be addressed in very different ways. This makes it somewhat difficult to extract the specific GW-SW related aspects.

In general, the integration of GW-SW into wider models can be categorised according to the following characteristics:

  1. The number of processes and elements of the hydrological cycle that are included in the integrated system;

  2. The type of conceptual/mathematical representation of such processes and elements (e.g. flow in rivers represented as physically-based 2D open channel flow versus simple conceptual routing);

  3. The degree of linkage between the different processes and elements of the hydrological cycle (fully coupled equations, interfaces between separate modules, etc.);

  4. The nature and type of the model components and process descriptions involved (numerical, conceptual, lumped, distributed, etc.), including questions about which processes are explicitly modelled and which are represented as boundary conditions;

  5. Temporal aspects of model discretisation and model coupling: parallel, sequentially, uniform or different time steps and, more generally, whether the calculations are steady state or transient; and

  6. The objectives, problem setting and focus of interest, including issues such as data availability (e.g. “ungauged basins”).

The possible number of combinations of all these characteristics is huge, making it nearly impossible to address the subject in a systematic way. Overviews describing different coupling strategies are provided by Barthel et al. (2008a), Ebel et al. (2009), Furman (2008), Kollet and Maxwell (2006), Markstrom et al. (2008), Levy and Xu (2012), Rossman and Zlotnik (2013), Sebben et al. (2013), and Spanoudaki et al. (2009).

From the large number of potential classification schemes indicated by the list above, we chose to categorise integrated models according to the coupling scheme only. We thus distinguish between:

  • Fully coupled schemes: equations governing surface and subsurface flows are solved simultaneously within one software package;

  • Loosely coupled schemes: two or more individual models are coupled via the exchange of model results, where the output of one model forms the input of the other.

The loosely coupled schemes are further subdivided into:

  • “Ready to use” software packages which contain two or more individual model components embedded in a common framework;

  • Loose coupling on an individual, less standardised basis (often developed and used in only a single context).

The boundaries between these classes, in particular between the two subdivisions defined for loosely coupled schemes, are rather fluid and mixed types can be identified.

The following sections summarise very briefly the main features of the aforementioned categories, and list the main software packages and applications in each class. The selection of software packages and applications focuses on those that have a more or less clear relationship to the regional scale, as it is defined in this article (i.e. designed to be used at regional scales or actually applied to larger scales). For each software package, we tried to identify model applications in model domains between 10³ and 10⁵ km² in size. If few or no such applications were found, we also included model domains between 10² and 10³ km² or much larger domains. Research was identified from the Scopus database, using the following search phrase: ( TITLE-ABS-KEY ( [model name] ) AND TITLE-ABS-KEY ( regional ) OR TITLE-ABS-KEY ( catchment ) OR TITLE-ABS-KEY ( meso*scale ) ) . In addition, we followed references and hints to software packages and applications within the literature identified.
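For reproducibility, the search phrase quoted above can be generated per software package with a few lines of code. The sketch below only substitutes the model name into the phrase as printed; it does not query Scopus, and the helper name and the choice of example packages (taken from section 3.1) are illustrative assumptions.

```python
# Search phrase as quoted in the text, with a placeholder for the model name.
SEARCH_TEMPLATE = (
    "( TITLE-ABS-KEY ( {name} ) AND TITLE-ABS-KEY ( regional ) "
    "OR TITLE-ABS-KEY ( catchment ) OR TITLE-ABS-KEY ( meso*scale ) )"
)


def build_search_phrase(model_name):
    """Return the search phrase for one software package (hypothetical helper)."""
    return SEARCH_TEMPLATE.format(name=model_name)


# Example packages named in section 3.1.
packages = ["ParFlow", "HydroGeoSphere", "InHM", "OpenGeoSys"]
phrases = {name: build_search_phrase(name) for name in packages}

for name, phrase in phrases.items():
    print(name, "->", phrase)
```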

3.1 Fully Coupled Schemes

Fully coupled modelling schemes, sometimes also referred to as “physics-based models” (see, e.g. Loague et al. 2006), have received wide attention and significant progress has been made with their development in recent years (Gaukroger and Werner 2011; Maxwell et al. 2014). The software packages mentioned most often in the literature include ParFlow (Kollet and Maxwell 2006), HydroGeoSphere (Therrien et al. 2009), InHM (VanderKwaak 1999), and OpenGeoSys (Kolditz et al. 2012). More examples are described in Sebben et al. (2013), Shi et al. (2013), Partington et al. (2013), Maxwell et al. (2014) and Bronstert et al. (2005). These fully coupled schemes have in common the attempt to achieve a physics-based description of all processes involved in the saturated zone, unsaturated zone and surface waters and thus generally avoid implementing interfaces between separate model modules (e.g. Brunner and Simmons 2012). In this way, they eliminate the boundaries between the traditional “compartments” (to varying degrees) and avoid a large number of problems that are related to using different concepts and software packages for different compartments. For example, HydroGeoSphere allows for stream/surface drainage network genesis, i.e. a river will form naturally within the model and interact with the groundwater in a physically based way. There is no need to predefine the river’s boundaries or its hydraulic head, which can be regarded as an outstanding advantage as it avoids the problems related to the riverbed-conductance concept (see section 2.1). In general, the focus shifts away from the traditional “flux through the river bottom interface” type of conceptualisation and becomes part of a holistic description of the water cycle.

Several model inter-comparisons for fully coupled software packages are ongoing (see, e.g. Delfs et al. 2012; IHMI Workshop 2011; Maxwell et al. 2014). The problems used for comparison encompass rather small areas and are built on synthetic test cases (Pryet et al. 2014). Maxwell et al. (2014) report comparisons of larger and more complex cases. The performance of different fully coupled software packages was also investigated by Sebben et al. (2013). They found that a lack of test cases and limited options for evaluating model performance (mainly river discharge only) hinder a meaningful comparison. It is common to use only river discharge (and not groundwater heads) to evaluate model performance of integrated models (Hattermann et al. 2004; Sebben et al. 2013). Table 1 shows an overview of fully coupled models and their application to catchments within the range defined as regional in this article.

Table 1 Overview of fully coupled schemes and their largest-scale applications

An additional software package which is quite frequently mentioned in the literature is InHM (Blum et al. 2002; Jones et al. 2006; VanderKwaak 1999). However, to the knowledge of the authors, this has only been applied to very small catchments between 0.001 and 0.1 km² so far (Blum et al. 2002; Jones et al. 2006; Mirus and Loague 2013; VanderKwaak and Loague 2001). Guay et al. (2013) and Semenova and Beven (2015) list even more examples of models from the fully coupled modelling domain.

3.2 Loosely Coupled Schemes

While fully coupled, physics-based modelling concepts and software packages are relatively easy to identify and describe, as they are “monolithic” by definition, it is more difficult to evaluate loosely coupled schemes, which consist of two or more independent model software packages. Apart from the fact that a large variety of model types (hydrologic, hydraulic, numerical, conceptual, 2D/3D, lumped, distributed, etc.) can be involved, these model codes can be coupled in very different ways with respect to the exchange parameters and various spatial and temporal aspects of coupling. Many such loosely integrated schemes are unique, i.e. only developed and used by specific groups and/or in specific spatial or problem contexts. Other coupling schemes are more standardised and are used by larger communities in varying contexts. It should be mentioned that in loosely coupled schemes, coupling can be “strong”, i.e. the exchange is spatially and temporally explicit for the model time, which allows the representation of feedbacks (Bronstert et al. 2005; Furman 2008). Alternatively, coupling can be “weak”, i.e. models are run completely consecutively. Hybrid forms also occur.
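The difference between “weak” and “strong” coupling can be illustrated with a deliberately simple sketch. It assumes a toy river store and a toy aquifer store that exchange water through a linear head-difference law; all function names, the exchange coefficient and the numbers are hypothetical and do not correspond to any of the software packages discussed here.

```python
def exchange(stage, head, conductance=0.1):
    """Flux from river to aquifer per time step (toy linear head-difference law)."""
    return conductance * (stage - head)


def run_weak(stage0, head0, inflow, steps):
    """Weak coupling: the river model runs to completion first, then the
    aquifer model reads the stored stages. The river never feels the aquifer."""
    stages = [stage0]
    for _ in range(steps):
        stages.append(stages[-1] + inflow)  # river model alone, no leakage term
    heads = [head0]
    for t in range(steps):
        heads.append(heads[-1] + exchange(stages[t], heads[-1]))
    return stages, heads


def run_strong(stage0, head0, inflow, steps):
    """Strong coupling: the flux is exchanged in every time step, so leakage
    to the aquifer also lowers the river stage (feedback)."""
    stages, heads = [stage0], [head0]
    for _ in range(steps):
        q = exchange(stages[-1], heads[-1])
        stages.append(stages[-1] + inflow - q)
        heads.append(heads[-1] + q)
    return stages, heads
```

In the weakly coupled run the aquifer gains water that the river never loses, so the combined water balance is not closed; in the strongly coupled run the total of both stores changes only by the inflow.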

3.2.1 Ready-made Model Packages for GW-SW

Overviews of ready-made model packages for GW-SW (primarily addressing loosely coupled schemes) have been compiled by CDM (2001), Blum et al. (2002), Levy and Xu (2012), Bobba (2012), Alaghmand et al. (2013), and Sebben et al. (2013). These overviews, however, do not directly address applicability or applications at the regional scale.

Most loosely coupled schemes are, in one way or another, based on MODFLOW (Harbaugh 2005; McDonald and Harbaugh 1988), which represents the groundwater compartment, while its various options represent surface-water-related boundary conditions. In principle, MODFLOW calculates the fluxes across the boundary between aquifer and surface water based on the difference in hydraulic head, an exchange coefficient representing the hydraulic conductivity of the river bottom, and geometric parameters of the interface. This can be done using different modules, e.g. the river package, drain package, stream flow routing package or the general head package (Prudic et al. 2004). Additions and enhancements to this basic scheme are frequently published (Barlow and Harbaugh 2006). Besides the MODFLOW-based, ready-made software packages, other, mainly commercial, software packages are available that follow a similar approach: they solve the equations for groundwater flow, surface water run-off and unsaturated flow independently and couple the processes via the exchange of results through boundaries. In most ready-made packages, coupling is strong.
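The head-difference calculation described above can be sketched as follows. This is a minimal illustration in the spirit of the riverbed-conductance concept used by the MODFLOW river package, not MODFLOW code; the function names, parameter names and example values are hypothetical.

```python
def riverbed_conductance(k_bed, length, width, thickness):
    """Conductance C = K * L * W / M of the riverbed within one model cell."""
    if thickness <= 0:
        raise ValueError("riverbed thickness must be positive")
    return k_bed * length * width / thickness


def river_exchange_flux(conductance, river_stage, gw_head, bed_bottom):
    """Flux from river to aquifer (positive = river loses water).

    While the aquifer head stays above the riverbed bottom, the flux scales
    with the head difference. Once the head drops below the bed bottom, the
    flux no longer depends on the aquifer head.
    """
    effective_head = max(gw_head, bed_bottom)
    return conductance * (river_stage - effective_head)


# Example: K = 0.5 m/d, reach 100 m long and 10 m wide, bed 0.5 m thick
c_riv = riverbed_conductance(0.5, 100.0, 10.0, 0.5)
losing = river_exchange_flux(c_riv, river_stage=101.0, gw_head=100.5, bed_bottom=99.0)
gaining = river_exchange_flux(c_riv, river_stage=101.0, gw_head=102.0, bed_bottom=99.0)
```

All hydraulic and geometric properties of the interface are lumped into a single conductance value per cell, which is why this parameter usually has to be calibrated rather than measured.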

In addition to the software packages listed in Table 2, more packages are mentioned in the literature. Again, many are based on MODFLOW or can, in some way, be linked or coupled to it, e.g. HEC-RAS (Rodriguez et al. 2008), DAFLOW (Jobson 1989), HEC-HMS (Scharffenberg and Fleming 2010), or MD-SWAT-MODFLOW (Ke 2013). Non-MODFLOW-based packages include HMS (CDM 2001; Yu et al. 1999), DYNSYSTEM (CDM 2015), and IGSM (LaBolle et al. 2003; Watson 1993). Regional applications of these have not, to the knowledge of the authors, been published.

Table 2 Overview of loosely coupled schemes (ready-to-use software packages and modules)

3.2.2 Other Applications of Loosely Coupled Schemes with Applications at the Regional Scale

The loosely coupled model software packages described in the previous section include, in general, fully developed coupling schemes, i.e. they can immediately be applied for simulating GW-SW without the need for a new user to develop their own interfaces, etc. In addition to such ready-to-use software packages, there are several examples in which previously independent groundwater and surface water models have been coupled within a specific problem context. Concepts for context-specific integration of two or more standalone models (GW, SW, UZ) are hard to review in a systematic way as there are so many unique combinations. It is also difficult to draw the line between a genuine coupling of groundwater and surface water models and the mere addition, to a groundwater or surface water model, of a module that represents the other compartments without necessarily taking into account the actual processes. Many such models seem to exist (e.g. Feyen 2005) but not all of them have necessarily been published in peer-reviewed literature. They are often developed by national agencies and use existing standalone models for GW and SW as a basis. Table 3 lists those model applications that could be identified from journal literature and clearly fall into the range we defined as regional (> 1,000 km²).

Table 3 Overview of schemes coupling GW and SW model software packages on an individual basis, applications to regions between 10³ and 10⁵ km²

In addition to the examples listed in Table 3, more packages are mentioned in the literature, but it remains unclear how relevant they are or could be. Gilfedder et al. (2012), for example, describe GWlag, a conceptual model for improving water resource decision-making by connecting surface water and groundwater characteristics and their interactions with land-use changes. It was applied in the Tarcutta Creek catchment, a part of the Murrumbidgee River catchment in New South Wales, Australia (1,630 km²). SWIM is a modified version of SWAT (Hattermann et al. 2004), with a groundwater module that only models groundwater levels at a spatial resolution of sub-catchments. A modified version has been applied to the Elbe catchment in Germany (see Table 3).

3.3 Summary of Modelling Tools and Regional Applications

In total, we identified 17 (25; see footnote 3) ready-to-use software packages (see Tables 1-3) to integrate GW and SW that were applied at the regional scale or could potentially be applied at the regional scale. Four (5) of these are fully coupled systems. The fully coupled software packages are quite similar with respect to functionality and features, while the loosely coupled systems differ in a huge range of aspects.

We identified 25 applications of regional models in areas between 1,000 and 100,000 km². Five of these employed a fully coupled system, 13 utilised ready-to-use packages and 7 employed more individual solutions. In addition, there are 16 regions between 100 and 1,000 km² that could be considered “small regional scale”, of which 12 made use of loosely coupled schemes.

In summary, around 30 software packages or more individually coupled modelling schemes were developed that claim to be applicable at the catchment or regional scale. These were used in around 40 model applications covering regions between 100 and 100,000 km². It follows that most models were not used in more than one regional-scale catchment. It is also clear that none of the catchments was modelled with more than one modelling scheme.

It is probably very easy to overlook ongoing and completed integrated modelling activities, in particular in the field of loosely coupled, context-specific modelling schemes, when focusing on journal literature only. It may be that such regional integrated models are presented more often in conference contributions and reports than in the easily accessible journal literature. This is confirmed to some extent by Wood (2012): “… I suspect that the sheer size and data complexity of these integrated models with their voluminous outputs might (make) them difficult to publish in traditional white or grey hydrogeological literature.” A more detailed analysis of this issue was provided by Burell (2008), who claims that integrated models cannot be adequately published in journal literature at all.

Nevertheless, the number of regional applications of coupled groundwater and surface water models is far lower than expected considering the importance of the topic and in view of the large number of publications dealing with integrated models of groundwater and surface water. In the groundwater field in particular, a lack of peer-reviewed scientific publications pertaining to regional scale studies and models can be observed (Barthel 2014a). Moreover, the often cited trend towards more integration has not yet led to the wide use of integrated models. For example, Rossman and Zlotnik (2013) reviewed 88 regional groundwater-flow modelling applications from the US and found that only 7% of them included any attempt at all to couple groundwater and surface water. Despite the low numbers, it is promising that most regional scale integrated models were published within the last 5 years. This might indicate a trend and mean that even more work is in progress.

3.3.1 Application Potential and Limitations of Fully Coupled Software Packages

The functionality of most of the fully coupled systems is overwhelming and it seems that they provide answers to almost any question we might have about water resources (see, e.g. Brunner and Simmons 2012). The usefulness of the fully coupled systems at the regional scale is quite often explicitly pointed out. On the other hand, few regional applications have been published and comparisons to other modelling approaches applied to the same area are missing. The viability of the models has primarily been proven in various test applications (Kollet et al. 2010; Sebben et al. 2013) and proof of applicability in practical management still has to be provided (Miller et al. 2013). So far, the application of fully coupled software packages seems to be largely restricted to academic studies, mainly at smaller scales and at specific test sites. Harter and Morel-Seytoux (2013) presented an evaluation of the applicability of HydroGeoSphere (HGS) to (regional) management problems; they concluded that its present use is mainly academic and expressed doubts about the validity of the mathematical representation at large scales. According to Harter and Morel-Seytoux (2013), both issues are common to many aspects of modern soil physics and fully coupled models (see also Semenova and Beven 2015). This might be a result of the huge computational cost, which has only recently been met by the availability of powerful parallel computing systems to a wider public. Yet it should be noted that these model concepts are relatively new, as indicated by a sharp increase in related publications since around 2005. Almost all model applications in regions larger than 100 km² were published after 2010, the majority in 2014.

3.3.2 Application Potential and Limitations of Loosely Coupled Schemes

Numerous loosely coupled schemes have been developed for the regional scale, but the number of actual published regional applications of these is low. Looking at the examples published in the scientific literature it seems that each individual scheme is applied only a few times or even just once (by one group of researchers, in one catchment). Extracting general findings or drawing conclusions from comparisons is therefore hardly possible.

It is apparent that the majority of regional scale models addressing GW-SW make use of either traditional hydrological (rainfall runoff in the widest sense) or groundwater flow models and represent the other part by relatively simple, conceptual descriptions (see also Fleckenstein et al. 2006; Hattermann et al. 2004). It is quite common that regional integrated models are based on surface water models rather than on groundwater models (Werner et al. 2006). Groundwater-centred integrated models most often employ MODFLOW, using a more or less complex solution for surface water discharge and soil moisture to provide input for the chosen boundary condition packages. From the loosely coupled systems with a full representation of both surface and groundwater, it appears that MIKE SHE and more recently FEFLOW coupled with MIKE11 are most often used for regional studies. The respective applications are more often driven by practical management questions than by mere scientific interest – a noteworthy aspect when it comes to laying out future pathways for integrated hydrological research.

The problems associated with such loosely coupled schemes stem from the fact that hydrological models typically do not calculate river stages at any specific point along a stretch of river. To obtain the information needed for calculating exchange fluxes based on pressure differences, interpolation of river water levels between gauges or other simple and pragmatic solutions are required. Another issue is that river bottom elevations above a common datum (sea level) are quite often not available for all potentially relevant river reaches. Hence, channel bottom elevations have to be fitted or derived from proxy data (Scibek et al. 2007; Wolf et al. 2008). This leads to a general problem within loosely coupled systems: GW-SW in such systems has to be based on the calculation of the exchange fluxes as a result of potential differences, the geometry of the river bed, and its hydraulic properties. In one way or another, most of the loosely coupled schemes apply the critical riverbed-conductance concept (see section 2.1).
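One of the simple and pragmatic solutions mentioned above, the interpolation of river water levels between gauges, can be sketched as follows. The sketch assumes linear interpolation along the river chainage (distance along the channel); the function name and the gauge values are hypothetical, and real applications may need to account for weirs, confluences or changes in slope between gauges.

```python
from bisect import bisect_right


def interpolate_stage(chainage, gauge_chainages, gauge_stages):
    """Linearly interpolate a river stage at `chainage` from gauges that are
    sorted by increasing chainage. No extrapolation beyond the gauged reach."""
    if not gauge_chainages[0] <= chainage <= gauge_chainages[-1]:
        raise ValueError("chainage lies outside the gauged reach")
    i = bisect_right(gauge_chainages, chainage) - 1
    if i == len(gauge_chainages) - 1:  # exactly at the last gauge
        return gauge_stages[-1]
    x0, x1 = gauge_chainages[i], gauge_chainages[i + 1]
    y0, y1 = gauge_stages[i], gauge_stages[i + 1]
    return y0 + (y1 - y0) * (chainage - x0) / (x1 - x0)


# Three hypothetical gauges at river km 0, 10 and 30 with stages in m a.s.l.
gauges_km = [0.0, 10.0, 30.0]
stages_m = [105.0, 100.0, 96.0]
stage_at_km5 = interpolate_stage(5.0, gauges_km, stages_m)
stage_at_km20 = interpolate_stage(20.0, gauges_km, stages_m)
```

The interpolated stage can then serve as the surface water head in the exchange flux calculation for river cells located between gauges.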

It is uncommon to report on failures and deficits when writing about models, either in peer-reviewed scientific literature or in grey literature. This makes it hard to evaluate the true potential of an approach. Our own experiences from the GLOWA-Danube project for the Upper Danube catchment (77,000 km², Barthel et al. 2012; Ludwig et al. 2003) and in the RiverTwin project for the Neckar catchment (14,000 km², Barthel et al. 2008a; Gaiser et al. 2008) in Germany gave deeper insights into the tremendous challenges associated with integrated modelling at the regional scale. Some of these difficulties have been discussed in detail in previous publications (Barthel 2006; Barthel et al. 2008a; Götzinger et al. 2008; Rojanschi et al. 2006; Wolf et al. 2008). For example, huge problems arise when groundwater recharge calculated by conceptual hydrological models is applied to physics-based numerical groundwater flow models, because the spatial distribution of this recharge does not take into account the geology of regional aquifers (discussion in Barthel 2006; Jie et al. 2011; Wolf et al. 2008). But most importantly, it became evident that data availability is far from sufficient, e.g. with respect to parameterising the exchange terms in MODFLOW. This is worrying as the Neckar and the Upper Danube catchments may be among the most intensively monitored regional-scale catchments in the world.

3.4 Regional Integrated Modelling in View of General Challenges of Hydrological Modelling

It was stated at the beginning of this paper that the subject “GW-SW” is one that is strongly connected to many of the big challenges of hydrological sciences:

  • How should one deal with uncertainty associated with data, models and predictions in the context of integration and stakeholder demands (e.g. Castelletti et al. 2008; Li et al. 2011)?

  • Should models be as simple or as complex as possible (see, e.g. Beven and Cloke 2012; Wood et al. 2011; Wood et al. 2012)?

  • How should we deal with heterogeneity, how do we scale up processes, properties and model parameters (see, e.g. Bárdossy and Singh 2011; de Marsily et al. 2005; Fleckenstein et al. 2006; Götzinger and Bardossy 2007; McDonnell et al. 2007; Nœtinger et al. 2005; Samaniego et al. 2010; Vermeulen et al. 2006)?

These questions are relevant for all sorts of hydrological problems and at all scales, but they are much more pronounced for regional scale GW-SW (Barthel 2014a). Discussing these issues on the general and abstract level of an overview as presented in this article is unfortunately impossible.

4 Discussion

The objective of this paper was to review the scientific literature that provides guidance on how to analyse, describe, understand, and finally model GW-SW meaningfully at the regional scale. The results of this evaluation are not very encouraging: knowledge of the topic is scattered and often difficult to identify. It is clear that a large body of literature related to GW-SW in general exists. A large amount of information is available about GW-SW at small scales and the number of studies in this field, both experimental and modelling, is rapidly increasing. Unfortunately, little guidance is available on how to apply the knowledge gained from these activities at the regional scale as defined in this article. The specific issue of “GW-SW at the regional scale” is hardly addressed explicitly at all.

It seems that part of the problem is defining what exactly ‘GW-SW at the regional scale’ is. While at the point or local scale it is easy to describe GW-SW as a process with a defined location, direction, and driving forces, it remains unclear if the same is possible at the regional scale (compare Figure 1 and Figure 2). GW-SW at the regional scale is often no longer a “process” of exchange between one groundwater and one surface water body as it usually is at local scales. GW-SW at the regional scale can be regarded as the result of the combination of all processes in a regional catchment or any other appropriately sized area of interest. How relevant these processes are depends on the regional setting and thus on a combination of rather different factors. But, more importantly, the relevance of different processes needs to be defined in relation to the problem setting, data availability and other practical constraints and demands. It is thus probably necessary to frame the definition of GW-SW at the regional scale more widely than at the local scale. GW-SW at the regional scale encompasses the entire terrestrial hydrological cycle, all the processes leading to changes of pressure, saturation (concentrations) within and even outside the area of interest.

The best source of information about GW-SW at the regional scale is literature on integrated modelling (see section 3). A large variety of modelling concepts capable of representing GW-SW has been published. Many of those concepts have the potential for application at the regional scale, yet real regional scale applications are still rare. This makes it difficult to assess the validity and appropriateness of the published approaches. The available knowledge remains scattered and difficult to use. It may be seen as a problem that regional, integrated modelling efforts are normally created by government agencies, which often have lower scientific ambitions than scientists from academia. Therefore much of the work carried out in the field may remain unpublished (see Barthel 2014a). Additionally, publishing large integrated modelling efforts is difficult within the constraints of a journal article (Burell 2008; Wood 2012).

If GW-SW at the regional scale is essentially regarded as the sum of all hydrological processes in a region/catchment, then the most appropriate way to address this seems to be the use of fully coupled, physics-based models. Those models actually attempt the required holistic description of the hydrological cycle and have the power to connect various processes over a range of spatial and temporal scales. However, while they seem to have the potential to solve all the related problems, it still has to be demonstrated that they are valid and applicable to practical regional management problems. Most authors agree that regional scale integrated modelling is constrained by data availability and that fully coupled models are usually not advantageous when combined with limited data (e.g. Brunner et al. 2010; Semenova and Beven 2015). Therefore, despite the attractiveness of fully coupled schemes, coupling relatively simple models using relatively simple coupling schemes may still provide a suitable approach even if this means severe oversimplification of complex regional systems. Currently neither of the parties advocating more or less complexity seems to be able to prove that their approach is better. Systematic comparisons using several alternative approaches are missing for the regional scale and thus advantages and disadvantages of one approach over another remain unknown.

It is not possible to determine the most suitable strategy for regional scale integrated modelling if the discussion focuses only on the scientific viewpoint. Even if the scientific community desires general, applicable, transferable approaches to deal with GW-SW at the regional scale, it might not be feasible to define such an approach independent of a practical management context. Thus, defining the management problem that needs to be solved and at what level of accuracy (spatial and temporal) is a necessary step, along with determining the degree of uncertainty that is acceptable. This means that approaches need to be context-specific and it is always necessary to define them together with stakeholders and end-users. In such a context, the quantitative performance criteria usually applied within the scientific community (the Nash-Sutcliffe coefficient etc.) or approaches to quantifying uncertainty remain rather abstract, difficult-to-communicate concepts that do not explain the usefulness and applicability of a model and its results for answering practical management questions. Regional models might thus have to be developed from a different perspective: either merely driven by unique, context- and scale-specific demands within the area of practical water resources management or by developing integrated regional models for the sole purpose of providing a regional framework for nested local solutions. Participatory and transdisciplinary approaches may be more helpful in the attempt to provide meaningful regional solutions. Instead of asking how we can fundamentally identify, understand, describe, and model all relevant processes at the regional scale, we may have to ask what the nature of the result of our models should be, in order to meet the requirements of practical management at the regional scale.
In the two regional integrated projects that formed the starting point and stimulus to write this paper, an intensive dialogue with stakeholders and potential end-users was carried out. It has frequently been pointed out that a gap exists between science and practice and that scientists are not creating the knowledge that practice (society) actually needs, thus explaining the low confidence of practitioners in models developed by scientists (e.g. Argent et al. 1999; Borowski and Hare 2007; Brugnach et al. 2007; de Kok and Wind 2003; Lerner et al. 2011; Olsson and Andersson 2007).

5 Concluding Remarks

The review of the literature that deals with GW-SW at the regional scale showed that there are very few studies that address this topic directly. Field experiments and studies based on regional monitoring of both GW and SW are largely missing, as are fundamental theoretical considerations on how to address the problem. Knowledge of how to examine GW-SW at the regional scale is mainly derived from studies carried out at local scales, without a clear theory of how “upscaling” should be performed (Sebben et al. 2013). There is hardly any evidence that this approach is appropriate. It is frequently mentioned that the relevance (dominance) of processes might be different at different scales, but there is no clear quantitative proof of this with respect to GW-SW. In general, existing knowledge pertaining to GW-SW at the regional scale is still very scattered and distributed over a wide range of different research fields. A large number of modelling concepts that are intended to be used for the integrated modelling of groundwater and surface water at the regional scale have been published. Many of them, however, have not been applied at the regional scale as defined in this article. To the knowledge of the authors, there is no single case study where two or more fundamentally different modelling concepts have been tested in the same catchment, and many modelling concepts have not been applied in more than one catchment. A systematic comparison is thus impossible. On the basis of peer-reviewed journal literature, it is not possible to decide which approaches are feasible, suitable and appropriate for integrated regional modelling under specific conditions (e.g. a given complexity of geology and a given level of data availability). The few regional scale integrated models described in the literature are all very specific and adapted to the conditions of the respective region, and the relevant publications focus on these specific aspects rather than on generic findings.

Perhaps when looking for solutions at the regional scale, one should refer to conference proceedings, agency reports and software manuals. This may be true, but cannot lead to a satisfying result from a scientific perspective. It is an unfortunate situation for the research community if, as suggested by Wood (2012), scientific results in this important subject area are mainly being published outside the peer-reviewed scientific literature.