1 Introduction

Numerical simulation models in coastal and marine science are typically set up and applied to evaluate large-scale or complex physical processes over interacting time scales. These models are calibrated by adjusting a number of independent parameters to obtain a match between field data and the simulated variables such as suspended sediment concentration (Oreskes et al. 1994). The results of such environmental modelling studies often provide the scientific basis for policy decisions. In order to more accurately translate model outcomes into policy decisions, there is a need to better quantify model uncertainty (Uusitalo et al. 2015). It is crucial that coastal managers are aware of the limitations and uncertainties present in the reported results. This is especially the case for model scenarios investigating future conditions, which may be outside the model calibration time and space.

Model uncertainty is a very broad term and is often used without referring to the nature or source of the uncertainty that is being dealt with. In this paper, we use the typology of Refsgaard et al. (2007) and Walker et al. (2003), focusing on uncertainty in the parameter space and the influence of this uncertainty on the model application. The approach taken is based on the GLUE (generalised likelihood uncertainty estimation) methodology by Beven and Binley (1992), which is often applied in hydrological studies. The rationale for this method is that there are several combinations of parameter values (once the model has been calibrated) that can capture observations equally well, therefore being equifinal. Those not in favour of process-based modelling approaches believe that equifinality reduces the applicability of models (e.g. Oreskes et al. 1994) and that these equifinal settings may hypothetically respond very differently outside the calibration space. This may occur when numerical models are used for estuarine sediment management to predict the response of an estuary to e.g. deepening or different dredging strategies. Such measures may have such a large consequence that the equifinal parameter settings are no longer relevant.

This is of great importance for the Ems Estuary, a heavily impacted estuary located on the Dutch-German border. A 3D numerical model was developed (van Maren et al. 2015a) to explore measures to reduce the suspended sediment concentration (SSC) throughout the estuary and thereby to improve the ecological status of the estuary. These measures may include a reduction of the depth of tidal channels, modified dredging and disposal strategies, or enlargement of intertidal areas. However, because of uncertainty in the exact value for critical parameters, the calibrated model may describe the present-day sediment dynamics with sufficient accuracy but may not be adequate to predict the future response of the suspended sediment concentrations to interventions. This is known as epistemic uncertainty (i.e. uncertainty resulting from parameter values or physical processes). More research and data gathering could reduce the bounds of those parameter uncertainties, but since field and laboratory analyses are often insufficient to determine the exact parameter value, input parameter uncertainty is often investigated through sensitivity analyses. Although several shortcomings were identified in the Ems Estuary model by van Maren et al. (2015a), improving the modelled physics requires development of fundamental process knowledge (mainly related to exchange of sediment between the Ems Estuary and the hyperturbid Ems River), which is not feasible on the short term. The exchange of sediment between the Ems Estuary and the lower Ems River is underestimated because the model is in morphostatic equilibrium, resulting in no net sedimentation. With future advances in understanding, these processes may be added or improved, and new upgraded calibration sets can be developed to reassess the impact of future scenarios. This is part of future work, however, and we therefore focus here on the parameter settings.

This paper aims at examining the prognostic accuracy of a suspended sediment transport model, by acknowledging the uncertainty in parameter space through equifinality and by testing the effect of equifinal parameter sets in scenarios for which the model was not calibrated. In section 2, we introduce the model formulations, elaborate in greater detail on equifinality, and introduce the methodologies to quantify model accuracy. The results of the multiple calibration procedure are given in section 3 and the results of the scenario studies are presented in section 4. A discussion follows in section 5.

2 Modelling approach

2.1 Model formulations

The multi-fraction sediment transport model of the Ems Estuary used in this study was introduced in van Maren et al. (2015a) applying transport formulations developed for fine-grained, cohesive sediment by van Kessel et al. (2011). This sediment transport model distinguishes two bed layers: an upper layer (S1) which rapidly accumulates and erodes, and a deeper layer (S2) in which sediment accumulates gradually and from which it is only eroded during energetic conditions (spring tides or storms). This S2 layer represents a sandy layer in which fine sediment accumulates during calm conditions. When the bed shear stress exceeds a critical value cr, the sandy layer becomes mobile, and fine sediment that previously infiltrated into this layer is slowly released. Sand transport itself is not modelled, and therefore the thickness of this sand layer is constant and user defined. Most fine-grained sediment is stored (buffered) in this S2 layer. S1 represents the typically thin fluff layer consisting of mud, which rapidly erodes. The erosion rate E1, of layer S1, depends linearly on the amount of available sediment below a user-defined threshold M0/M1. M0 is the standard zero-order erosion parameter (kg/m2/s), whereas M1 (1/s) is the erosion parameter for limited sediment availability:

$$ {E}_1=m{M}_1\left(\frac{\tau }{\tau_{cr,1}}-1\right)\kern0.5em m<\frac{M_0}{M_1} $$
(1)
$$ \begin{array}{ll}{E}_1={M}_0\left(\frac{\tau }{\tau_{cr,1}}-1\right)\hfill & m>\frac{M_0}{M_1}\hfill \end{array} $$
(2)

Here, m is the mass of sediment in layer S1 (in kg/m2). This has the important consequence that in dynamic environments the equilibrium sediment mass on the bed is non-zero, contrary to standard Krone-Partheniades (KP) models (in which the sediment mass is zero or continuously increasing). Typically, this results in smoother and more realistic model behaviour in mixed sand-mud environments (m < M 0/M 1). For completely muddy areas (m > M 0/M 1), the buffer model switches to standard KP formulations for erosion of bed layer S1.

The erosion (E 2) of S2 scales with the excess shear stress to the power 1.5, in line with empirical sand transport pick-up functions, assuming that fines trapped within the sandy bed are released when sand is mobilised:

$$ {E}_2={p}_2{M}_2{\left(\frac{\tau }{\tau_{cr,2}}-1\right)}^{1.5} $$
(3)

Here, p 2 is the mass of fines in S2 (computed by the model) and M 2 is the resuspension parameter for S2 (kg/m2/s). Sediment can be eroded from layer S2 simultaneously with layer S1 (so S1 does not need to deplete before layer 2 can erode). The mass of sediment can be converted to volume with the dry bed density (assumed to be 500 kg/m3). This is relevant for the maximum amount of sediment that can accumulate in layer S2, which has a finite volume determined by a user-specified thickness of S2 (0.1 m in this application). The volume of sediment is irrelevant for layer S1 (which deposits on the bed); the feedback between bed changes and hydrodynamics is not accounted for (for computational efficiency, see the section hereafter).

The deposition flux D is the settling velocity w s times the near-bed sediment concentration C:

$$ D={w}_{\mathrm{s}}C $$
(4)

The deposition flux D (kg/m2/s) is computed without a critical shear stress for deposition, so deposition is modelled independently of the bed shear (Sanford and Halka 1993; Winterwerp 2007). All sediments initially deposit in the surface layer S1. Transport of sediment from S1 to the deeper layer S2 (F burial) depends on the amount of available sediment m and a user-defined diffusion parameter or burial rate k b:

$$ {F}_{\mathrm{b}\mathrm{urial}}={k}_{\mathrm{b}}m $$
(5)

We use two sediment fractions, IM1 with a large settling velocity (representing large flocs) and IM2 with a smaller settling velocity (representing finer or poorly flocculated sediment).

2.2 Model application

The Ems Estuary model is set up with eight vertical layers using the Delft3D software (Lesser et al. 2004), using a standard k-ε model for vertical mixing. The mesh size of the horizontal curvilinear grid decreases from ∼1000 m near the offshore boundaries down to ∼80 m inside the Ems Estuary. The model bathymetry is based on surveys by the Dutch Ministry of Public Works in 2005 (Fig. 1). The model is forced at the seaward boundaries by tide- and storm-induced water levels and by river inflow through various discharge points. Waves are simulated with SWAN (Booij et al. 1999), with wave heights and periods prescribed at all three open boundaries based on observations at a nearby wave buoy (SON, see Fig. 1), and with space- and time-varying wind fields to compute additional wave generation throughout the model domain. Wave-current interaction is not accounted for, but the combined effect of waves and currents on sediment resuspension is computed using the Soulsby et al. (1993) parameterisation of the maximum bed shear stress by Fredsoe (1984). Time-varying salinity is prescribed at the open boundaries, based on nearby observation stations. See van Maren et al. (2015a) for a more detailed description of the hydrodynamic model setup and calibration.

Fig. 1
figure 1

Top left: general location of the study area. Top right: map of the Ems Estuary and model domain with the ports of Emden, Delfzijl and Eemshaven and location of monitoring stations for model boundary conditions (BC1 and BC2, in green). Lower panel: more detailed map with SSC observation stations (yellow dots) and combined flow velocity—SSC observation stations (red dots). Only the bed level between −2 and 14 m is shown to highlight the difference in tidal flats and channels, but the channels and offshore sea may be up to 30 m deep

The hydrodynamic model is run for 1 year (2012), and the computed hydrodynamics are used to drive the sediment transport model. The sediment transport model is run for multiple years using the same hydrodynamic model output in cyclic mode. The reason for this is that the hydrodynamic model is the most computationally demanding, whereas the sediment transport model requires many years to achieve dynamic equilibrium. A hydrodynamic model run requires about 10 days (on a fast PC) to simulate a year, whereas the sediment model only requires 1 day for 1 year. During each cyclic run, the boundary conditions and modelled processes are identical to the previous run, except for the initial mud content (in the bed and water column). The sediment model is initialized without bed sediment: all sediment enters through the model’s open boundaries, and the model is run until the sediment on the bed is in dynamic equilibrium with the hydrodynamics. The period to achieve such a dynamic equilibrium in most estuarine settings with the applied model is typically 5 to 10 years: here, we use a spin-up time of 10 years to achieve the baseline calibration setting.

Besides modelling estuarine sediment transport, an integral function of the model is the dredging and disposal routine. Sediment deposition in the ports and fairways is computed by the model. The grid cells inside the ports are too large (∼100 m) to accurately simulate horizontal exchange, and therefore only port siltation by tidal filling and salinity-driven vertical exchange flows is realistically represented. Sediment is frequently dredged from both bed layers in the port areas and released on the near-bed (layer 8) grid cells of disposal ground pertaining to that particular port. If there has not been any port sedimentation in the model, no sediment can be dredged. The numerical dredging routine thereby mimics the actual dredging dynamics as close as possible. Apart from being physically more realistic to apply such an integral dredging routine, port siltation also becomes an independent model output to validate the model with. Moreover, several scenarios for improving the sediment dynamics in the estuary focus on modifying the dredging strategies in the estuary: a realistic simulation of dredging and disposal quantities is therefore essential.

2.3 Equifinality approach

Sediment transport may be supply limited or erosion rate limited. Transport of non-cohesive sediment (sand) is usually erosion rate limited, where the suspended sediment concentration is limited by the transport capacity of the flow and not by the availability of sediment on the bed. Cohesive sediment is usually supply limited: the transport capacity of the flow is much larger than the amount of sediment available. It is not straightforward to differentiate erosion-rate limitations from supply limitations under large-scale estuarine conditions, although numerical models provide a means to do so (see e.g. van Maren et al. 2011).

Two main problems arise for complex three-dimensional modelling of fine sediment transport. Firstly, many complex three-dimensional sediment transport models are so computationally intensive that stochastic simulations are not feasible and therefore the amount of simulations that can be carried out with these models is limited. This introduces a paradox: the greater the complexity of a model, the greater the possibility of equifinal parameter sets but also the greater the constraints on identifying these multiple model calibrations. Such multiple calibration sets yield similar results within the model calibration domain (by definition, exemplified in Fig. 2) but may strongly diverge in prognostic mode (outside the model calibration domain): see also Fig. 2. The existence of equifinality is often considered as an argument not to use complex models, since apparently their predictive power is limited (Pilkey and Pilkey-Jarvis 2007; Cooper and Pilkey 2004; Oreskes 2000). However, equifinality may not necessarily increase the uncertainty of predictions, a hypothesis which will be explored in this paper.

Fig. 2
figure 2

Three least-square fits of the Eq. Y = aX b + c through randomly generated data points, with b = 0.5 (red line), b = 1 (blue line) and b = 2 (black line). The solid line represents the calibration domain, whereas the dashed line is outside the calibration domain

Secondly, even though model output (dependent variables such as sedimentation rates or suspended sediment concentration) can be compared fairly well with observations, this does not hold for the model input parameters. Even though the equations determining exchange between the bed and the water column (Eqs. 15) contain only a small amount of input parameters, uncertainty remains in their values. Progress has been made recently to relate these input parameters to soil parameters (Winterwerp et al. 2012), but the main independent variables (including multiple erosion parameters and sediment settling parameters) are often difficult, too time-consuming or costly to measure. The uncertainty in their value ranges from a factor 2 to 5 for settling velocity w s and critical shear stresses τcr,1 and τcr,2. The erosion parameters M 0, M 1 and M 2 or the burial rate k b cannot be properly measured, introducing an uncertainty of about an order of magnitude. As a consequence, the settling velocity w s and critical shear stresses τcr are estimated based on empirical values presented in the literature, whereas the various values for M are obtained through calibration. The relatively large uncertainty in input parameters in Eqs. 15 implies that equifinality (dictating that multiple model parameter settings exist which reproduce the calibrated variables to the same degree of accuracy) must exist.

The various input parameters are mutually dependent in a physical sense and a random distribution of their values (as common practice in stochastic models) will generate a large number of meaningless model results. Instead of a Monte Carlo simulation of parameter distributions, we generate multiple calibration sets through an iterative procedure which may be time-consuming but requires fewer simulations, and which ensures at the same time that the parameters are kept within a realistic range. This procedure consists of (1) a sensitivity analysis, (2) definition of multiple calibration sets based on the sensitivity analysis and (3) fine-tuning of the multiple calibration sets. We start with a baseline calibration (very similar to the model introduced by van Maren et al. 2015a) in which the main model parameters (w s, τcr,1, τcr,2, M 0, M 1, M 2 and k b) were individually varied as part of a sensitivity analysis (Fig. 3). Based on this sensitivity analysis, three alternative model settings (A1–A3) were developed, each influencing model behaviour on a different level:

  1. A1.

    On a first level, the dependence of erosion E on bed shear stress τ is evaluated by varying τcr but maintaining a similar erosion flux by compensating the change in τcr with a change in M. The sensitivity analyses revealed a stronger response of suspended sediment dynamics to settings of layer S2 than to settings of layer S1, and therefore only τcr,2 and M 2 are varied.

  2. A2.

    On a second level, the erosion flux E 2 is increased by lowering τcr,2 and increasing M 2, compensated by also increasing the deposition flux to layer S2 (realised by enlarging the diffusion parameter k b).

  3. A3.

    On the third level, the settling velocity is increased (leading to more up-estuary transport), compensated by a reduction M 2 (leading to lower up-estuary transport). Note that on a small scale, both parameter changes would have the same effect (less erosion, more deposition), but on the overall estuary, both model parameter changes have an opposite effect.

Fig. 3
figure 3

Sensitivity of the difference in the computed suspended sediment concentration Δc at station S5 to variability of the settling velocity w s (a), the erosion parameters M 1 and M 2 (b), the critical shear stress τcr,2 (c), and thickness of the lower layer S2 (d). Δc is defined as the difference between the baseline suspended sediment concentration and the alternative suspended sediment concentration, low-pass filtered with a 24-h running mean

The baseline calibration was run for 10 years to achieve dynamic equilibrium (represented by similar sediment concentrations and available bed sediment in consecutive years). The model alternatives A1–A3 (and the sensitivity analysis) were run for 5 years, starting with the initial bed sediment computed with the baseline calibration. After approximately 4 years, the sediment transport model achieves dynamic equilibrium and therefore 5 years is considered a sufficiently long period (requiring approximately 5 days computational time). The model alternatives generally needed one more calibration improvement to give satisfactorily data-model comparisons. The original model settings and alternatives 1–3 are subsequently applied to predict the effect of two interventions in the estuary: (1) reducing the main tidal channel to its original depth and (2) disposing all sediment dredged from the ports offshore instead of within the estuary, as is the current practice.

3 Equifinal parameter sets

3.1 Detailed model setup

The baseline model settings reproduce the spring-neap and intra-tidal variation of SSC in the estuary mouth, the seasonal variability in the along-estuary SSC gradient, spatial variation in mud distribution, and typical port siltation rates. An uncertainty analysis was performed on this baseline model calibration, in which a number of input parameters were systematically varied. These variations not only generate different average SSC levels but also variations throughout the year (exemplified in Fig. 3): some parameters have their largest impact during low-energy conditions (the erosion rates M 0 and M 1, and the thickness of S2 have the largest impact in summer; see Fig. 3b, d), whereas others mainly influence winter conditions (the settling velocity and critical shear stress; see Fig. 3a, c). Based on this sensitivity analysis, parameter settings with contrasting impact on SSC (average levels but also the interannual variations) were combined into three alternative model settings. A brief description of the baseline model settings and the three alternative settings follows hereafter.

In the baseline model alternatives 1 and 2, the settling velocities w s for the 2 sediment fractions IM1 and IM2 are 1.2 and 0.25 mm/s, respectively—see Table 1. The settling velocity of IM1 (1.2 mm/s) is based on maximum floc settling velocities of 1–2 mm/s observed in the Ems Estuary by van Leussen and Cornelisse (1996). The IM2 settling velocity corresponds to the minimum settling velocity observed by van Leussen and Cornelisse (1996). Sediment in the model domain enters through the open boundaries, where boundary values for IM1 and IM2 were based on observations at nearby stations BC1 and BC2 (see Fig. 1 for location), with half the observed SSC assigned to IM1 and the other half to IM2. Sediment with a larger settling velocity has a larger up-estuary transport capacity, and therefore the ratio of IM1:IM2 changes from 1:1 in the North Sea to 4:1 in the landward part of the estuary. IM1 is mainly responsible for the SSC signal with a strong intertidal variation, whereas IM2 acts as a wash load fraction needed to maintain a minimal sediment concentration around slack tide. For model alternative 3, the settling velocity is increased by 50 %.

Table 1 Model parameters varied for the multiple calibrations

Spatially uniform values for the critical shear stress for erosion τcr are prescribed for the S1 layer and the S2 layer. Sediment which does not, or only marginally, consolidate has a critical shear stress for erosion τcr of 0.01 to ∼0.1 Pa (e.g. Widdows 2007). Therefore, the critical shear stress for the fluff layer is very low (τcr = 0.05 Pa), implying that sediment in the top layer is easily resuspended. This parameter has a relatively minor effect on overall sediment dynamics. Sediment in S2 is assumed to erode during more energetic conditions only when the mud trapped in the sand layer is released. This occurs at larger shear stresses than the initiation of motion of sand particles; earlier studies (van Kessel et al. 2011) suggested a value around 1 Pa. In the baseline model, τcr,2 is set to 0.9 Pa and is increased (decreased) for alternative 1 (2). The higher τcr,2 in alternative 1 is compensated by a higher burial rate: sediment more rapidly deposits in the lower layer. Finally, the erosion parameters M 1 and M 2 (see Table 1) are adjusted to maintain a comparable tide-averaged resuspension rate (M 0 is kept constant) and provide the final calibration parameter to optimise modelled and observed suspended sediment concentration.

3.2 Qualitative model calibration

For SSC, the modelled sediment dynamics for each of the parameter sets are first compared qualitatively to available data (collected in the main tidal channels in 2012) to examine their equifinality. For stations S1–S6 (see Fig. 1 for locations), only snapshot observations of SSC are available (filtered water samples collected every 2 weeks near the water surface). These observations are typically within the intra-tidal variation of the computed SSC (with Fig. 4 as an example). Both the observed and computed SSC varies between ∼0.1 to ∼0.2 kg/m3, with larger values in winter (Fig. 4a). The discrepancies between observed and modelled SSC may be up to several 0.01 kg/m3, but the difference in SSC levels is much lower among the different model setups (Fig. 4b). Except for January, the average SSC levels of the various model realisations are within 0.05 kg/m3.

Fig. 4
figure 4

Time series of measured (red dots) SSC and the baseline model run (blue line), at station S5 (a). The model alternatives are plotted as deviations from the baseline run, low-pass filtered with a 24-h running mean (b)

The along-estuary distribution of the computed SSC is comparable for the baseline model and the parameter set alternatives (Fig. 5). The sediment concentration is lowest at the estuary mouth (station S1) and increases in the landward direction. The sediment concentration is highest during the winter months, because of larger wave-induced resuspension. The difference between the parameter set alternatives is smaller than the difference between the data and model results. The difference between the alternatives is largest during winter conditions; unfortunately, this is also the period that least data is available.

Fig. 5
figure 5

Monthly averaged observed surface sediment concentration (black dots, February through November) and model realisations in 2012 at stations S1 to S6 (in kg/m3). The colours refer to the model realisation, and the grey dashed line is the standard deviation of the baseline model run (computed for each month from hourly model output). See Fig. 1 for the location of stations and Table 1 for the main settings of the baseline model and the three alternatives

The only available continuous time series of SSC were collected at stations GSP2 and GSP5 (see Fig. 1 for location), collecting data from March to May 2012. Data was collected halfway in the water column and near the bed, using OBS’s collecting data at 10-min intervals. These data provide a means to compare the intra-tidal and spring-neap variation in computed SSC. The baseline model calibration very reasonably reproduces observations (Fig. 6), and the four parameter set alternatives are nearly identical (Fig. 6). Similar to the comparison with the snapshot observations, there is a smaller difference between the alternatives than between the model and the data. For all parameter set alternatives, a visual classification would label the comparison between observed and modelled SSC as ‘good’ (accounting for the complexities inherent to modelling fine-grained suspended sediment transport) and therefore they can be considered, at least qualitatively, equifinal.

Fig. 6
figure 6

Computed water level (a), observed (black) and computed (red) depth-averaged flow velocity (b), near-surface sediment concentration (4 m below the water surface (c)) and near-bed sediment concentration (d) with observed SSC in black and model realisations in colour (B is baseline, A1 to A3 is alternative 1 to alternative 3), at location GSP5, from 3 to 8 March 2012. See Fig. 1 for the location of stations and Table 1 for the main settings of the baseline model and the three alternatives

3.3 Skill assessment

A range of statistical techniques are commonly used to assess the skill of water quality models, applied by Stow et al. (2009), Joliff et al. (2009) and Friedrichs et al. (2009), which can also be applied to evaluate the accuracy of the modelled SSC. Here, we follow the target diagram methodology presented by Joliff et al. (2009) which are similar to Taylor diagrams developed earlier (Taylor 2001) but are more straightforward to interpret. Target diagrams are plots to quantify and visually summarise various aspects of model performance, relating the model bias B* on the ordinate to the unbiased root-mean-square-error RMSE* on the abscissa:

$$ B\ast =\frac{\overline{M}-\overline{D}}{\sigma_{\mathrm{D}}} $$
(6)
$$ \mathrm{RMSE}\ast =\frac{\mathrm{sign}\left({\sigma}_{\mathrm{M}}-{\sigma}_{\mathrm{D}}\right)}{\sigma_{\mathrm{D}}}\sqrt{\frac{{\displaystyle {\sum}_{n=1}^N}{\left[\left({M}_{\mathrm{n}}-\overline{M}\right)-\left({D}_{\mathrm{n}}-\overline{D}\right)\right]}^2}{N}} $$
(7)

With \( \overset{-}{M} \) and \( \overset{-}{D} \) as the time-averaged modelled and observed variable, σ M and σ D the standard deviation of the modelled and observed variable. The model bias B* is a measure for the mean reproduction of a particular variable whereas the RMSE* quantifies the difference between model and data variability. A positive B* implies the model mean is larger than the observed mean; a positive RMSE* means that the modelled variability is larger than the observed variability (and vice versa for negative values of B* and RMSE*). The RMSE* is related to the correlation coefficient R (Taylor 2001; Joliff et al. 2009), and therefore a correlation coefficient criterion can also be used in target diagrams (Los and Blaas 2010).

B* and RMSE* are computed for stations with snapshot observations collected near surface (S1–S6) and stations where continuous time series are measured at mid-depth (GSP1 and GSP2). The snapshot observations are compared to modelled sediment concentration at the same time period as the observations, and therefore both B* and RMSE* are based on 20 data-model comparisons. For GSP1 and GSP2, hourly observations with a 2-month duration are used.

The agreement at GSP5, S2, and S5 (Fig. 7) (with \( \sqrt{B_{*}^2+{\mathrm{RMSE}}_{*}^2} \) = 0.7 to 1) can be classified as ‘reasonable’ (Los and Blaas 2010). The average SSC level is good, with a model bias |B*| < 0.5 (except for station S1). The reasons for the large discrepancy between the observed values at S1 are not exactly known, but probably related to its position on the transition from turbid estuarine waters to clear North Sea waters. In the model, S1 is still part of the estuarine domain whereas in the observations it is already dominated by North Sea waters.

Fig. 7
figure 7

Target diagram for the eight observation stations, using the baseline model settings (circles) and the three alternatives (squares, triangles and stars for A1, A2, A3, resp). See Fig. 1 for station locations

The variation in SSC is less well reproduced, suggested by the relatively large RMSE*(Fig. 7). However, it is questionable to what extent a signal with a pronounced intra-tidal variation (such as SSC) can be compared with snapshot observations without such a tidal variation (S1–S6). Only Los and Blaas (2010) applied target diagrams for snapshot observations of SSC, but their area of interest was a coastal sea with a much smaller tidal variation in SSC. In general, target diagrams have mainly been used in coastal areas, where time variations and spatial variations are much smaller than in energetic estuaries such as the Ems Estuary. As a result, there is no proper benchmark available to interpret model performance, and it remains difficult to ascertain how ‘good’ the model reproduces the data. However, as concluded earlier from Figs. 4 and 6, the difference between individual model parameter sets is smaller than between the model and the observations.

3.4 Model evaluation

Despite discrepancies between the model and the data (which may also partly result from inaccuracies in the data—see van Maren et al. 2015a), both the qualitative and quantitative model assessment suggests that all model parameter alternatives describe the SSC data equally well: there is no ‘best’ model setting. Next, the model outcomes are compared in areas in which no data is available but where different processes may dominate sediment transport: the intertidal flats. On the tidal flats, consolidation of fine-grained sediment and biological effects may have a large impact on the sediment dynamics. As a result, both the dominant transport processes may be different compared to the tidal channels, but also optimal model settings may differ.

The baseline model does reproduce the spatial variation of fine sediment deposition, with muddy intertidal flats and sandy channels (in line with observations; see van Maren et al. 2015a). The effect of the model alternatives on the spatial distribution of SSC is evaluated by yearly averaging the model results. Although the yearly averaged difference in SSC is very small in the channels (where data was available), the differences become larger on the tidal flats (Fig. 8). Alternative 1 (Fig. 8b) is less dynamic during low-energy conditions but more dynamic during high-energy conditions. Averaged over the year, the result is higher SSC levels on all tidal flats. Increasing erosion from the lower layer, but also the diffusion between the lower and upper bed layers (alternative 2, Fig. 8c), has contrasting effects on the exposed and sheltered tidal flats. The computed sediment concentration over the exposed flats (in the outer estuary) increases slightly whereas that over the sheltered flats (deeper in the estuary) becomes lower. The model alternative with a larger settling velocity (alternative 3, Fig. 8d) mainly reduces SSC on the energetic flats. Only alternative 2 (Fig. 9c) generates a reduction in the mud content. In general, the mud distribution within the sediment bed is only marginally influenced for all parameter alternatives (Fig. 9). There is insufficient information available on the mud content (see van Maren 2015a) to discern which of the model setups provide a better approximation of the actual mud content.

Fig. 8
figure 8

Average near-surface sediment concentration computed with the baseline run (a) (in kg/m3) and difference in SSC (in kg/m3) for alternatives A1–A3 (bd, resp). The difference is defined as baseline minus alternative

Fig. 9
figure 9

Spatial distribution of the yearly averaged amount of sediment in the top 10 cm of the bed (layer S1+S2), for the baseline model (a) and alternatives A1–A3 (bd, resp)

Lastly, the observed siltation in the ports of Emden, Delfzijl and Eemshaven are compared for the different model parameter sets (Table 2). For Eemshaven and for Delfzijl, the agreement between observed and modelled siltation is very good (typically within 10 % of observations) for all model alternatives. Siltation in the port of Emden is strongly underestimated, because of complex sediment exchange mechanisms between the Ems Estuary and the lower Ems River (see van Maren et al. 2015a). Alternative 3 (larger settling velocity) slightly improves siltation rates here. Although siltation in Emden Port is still underestimated, the error is less than for the baseline model.

Table 2 Estimated (from dredging volumes) and computed (numerical model) deposition rates (in 106 ton/year) in the ports of Eemshaven, Delfzijl and the port and fairway of Emden, for the baseline model and the three alternatives. The observed dredging volumes have been converted to mass using a dry density of 500 kg/m3 (which is based on data collected by Mulder 2013)

4 Management scenarios: the effect of equifinality

The Ems Estuary has become more turbid in the past decades (de Jonge et al. 2014; van Maren et al. 2015a). In the lower Ems River (a tidal river that drains into the Ems Estuary), channel deepening has increased the SSC to the extent that the estuary has become hyperturbid (Winterwerp et al. 2013; van Maren et al. 2015b). The mechanisms responsible for the increase in SSC in the outer estuary are less straightforward. Although the magnitude of the increase is much smaller than in the lower Ems River, the increase may have negative ecological impacts due to the resulting loss in light attenuation. Over the past centuries, large reclamations of intertidal areas prevented settling of fine-grained sediment and therefore led to increasing SSC levels (van Maren et al. 2016). In the past decades, no longer extracting sediment at the head of the estuary (which used to be a common practice until ∼1990, reducing the along-estuary concentration gradient) has increased SSC (van Maren 2015a, 2016). This was strengthened by simultaneous deepening of the tidal channels (through enhanced salinity-driven estuarine circulation, transporting sediment up-estuary)—see van Maren (2015a). Therefore, two scenarios have been selected for this study (see Fig. 10): reducing the depth of the main tidal channel (scenario A) and disposing all sediment dredged from the ports and access channels offshore and not within the estuary itself (scenario B).

Fig. 10
figure 10

Visualisation of evaluated model scenarios: disposal of dredged sediment at sea (red dot) instead of the present-day disposal areas (green) and channel restoration (orange to red shades) of the tidal channel. Bed level contours drawn at 5-m intervals

The original model configuration (existing disposal site, present-day bathymetry, baseline settings) is the reference model. This reference model and scenarios A and B are run with the settings of the baseline model and the equifinal parameter alternatives 1–3 (total 12 simulations). The impact of the scenarios are computed for the baseline settings and equifinal alternatives 1–3 individually by subtracting the scenario model results from the reference model result.

Raising the bed level in the tidal channel reduces SSC levels on the tidal flats in the upper parts of the estuary (Fig. 11a–d) but increases SSC in the outer estuary. This redistribution of SSC reflects the smaller up-estuary sediment transport, resulting from weakening of gravitational circulation (van Maren et al. 2015a). This gravitational circulation may result from classical salinity-driven estuarine circulation (as suggested by van Maren et al. 2015a) or from tidal asymmetries in vertical mixing (Pein et al. 2014). For both mechanisms, gravitational circulation generates an up-estuary flow near the bed and a down-estuary flow near the surface. Due to the typical increase of SSC from the water surface towards the bed, gravitational circulation drives an up-estuary sediment transport. The magnitude of gravitational circulation increases non-linearly with depth, and therefore a shallower channel results in less up-estuary transport. The effect of the channel depth on SSC changes only marginally for the equifinal parameter sets: the baseline model results (Fig. 11a) differ negligibly from alternatives 1–3 (Fig. 11b, c).

Fig. 11
figure 11

Increase in near-surface SSC (in kg/m3) as a result of shallowing (scenario A, top panels) and offshore disposal (scenario B, lower panels), for the model baseline run (a, e), alternative 1 (b, f), alternative 2 (c, g) and alternative 3 (d, h). Each SSC difference is defined as the yearly averaged SSC of a particular scenario minus the yearly averaged SSC of the 2012 model, for the same model parameter settings

Disposing sediment offshore leads to substantially lower SSC in the majority of the estuary (Fig. 11e–h). This decrease corresponds to earlier results (van Maren et al. 2015a) in which stopping large-scale sediment extraction was concluded to be an important driver for the observed SSC in the estuary. Similar to the impact of channel depth, the computed effect of disposal locations is not substantially influenced by the equifinal parameter sets.

The effect of equifinality on scenario alternatives is further explored using boxplots generated for locations S1–S6 (Fig. 12). For all stations, the mean SSC values vary only slightly between the baseline and the alternative settings (which is expected as the model alternatives were calibrated to give similar results at these stations). As already revealed in Fig. 11, the offshore disposal leads to lower SSC levels, but the boxplot additionally indicates the statistical significance. For offshore disposal, the 75 % standard deviation (indicated with the box) at locations S3–S6 is below the mean of the reference model, whereas the whisker (99.3 %) is only slightly above the mean. Using the equifinal parameter settings has a minor impact on the impact of seaward disposal. Likewise, the restoration of the channel to its original depth does not significantly influence the SSC levels at locations S1–S6. This observation is in agreement with Fig. 11a–d, which revealed changes in SSC on the intertidal flats, but only marginally within the tidal channel.

Fig. 12
figure 12

Boxplot of the average sediment concentration (circle) the 75 % (filled box) and 99.3 % (whisker) standard deviation, for the reference model and two scenarios, for the baseline model settings and three scenarios, at locations S1–S6 (af, resp)

5 Discussion

Model uncertainty may arise from a lack of knowledge (epistemic uncertainty) and from variability inherent to the physical system under consideration (Walker et al. 2003). Epistemic uncertainty results in an imperfect description of physical processes and an inability to accurately quantify model parameter settings. The inherent variability of the system may stem from meteorological and climate events or changing boundary conditions due to events outside the domain of the system being modelled. Contrasting with this variability uncertainty (with its inherent stochastic nature), epistemic uncertainty may be reduced through additional model development or field observations. In this paper, we focus on one part of the epistemic uncertainty: model parameter settings. Even though the uncertainty in process formulations may have a similar or even larger impact on model outcome, we have focused here only on uncertainty in parameter settings.

The sensitivity of model outcome to uncertainty in model input parameters can be identified through stochastic or probabilistic models. This has become a common practice in hydrological models but is also used in one- or two-dimensional morphological models (van der Klis 2003; van Vuren 2005; van der Wegen and Jaffe 2013). Stochastic approaches are also used in weather prediction models, with the most important differences between the various global weather forecasting models mainly being the way they define variability in parameter settings and initial conditions (Parker 2010). A major difference however between transport models and weather models is that weather prediction models are very sensitive to initial conditions, with small perturbations in the initial conditions having large impacts on the forecast (Lorentz 1963).

The basis of stochastical modelling is that a certain likelihood distribution is assumed for the independent variables, of which the effect is evaluated through Monte Carlo simulations. Such an approach is less suitable for complex three-dimensional fine sediment transport models for two reasons. Firstly, many complex three-dimensional sediment transport models are so computational intensive, that stochastic simulations are not feasible. Secondly, even though model output (dependent variables such as sedimentation rates or suspended sediment concentration) can be compared fairly well with observations, this does not hold for the model input. Moreover, the various input parameters are mutually dependent in a physical sense and a random distribution of their values will generate a large number of meaningless model results. The equifinality thesis adopted in this paper provides a feasible and physically realistic modelling approach.

Equifinality means that many different parameter settings may be acceptable in reproducing observed behaviour using process-based numerical models (Beven and Freer 2001) and has been demonstrated for rainfall-runoff models, flood frequency and inundation models, river dispersion models, soil vegetation-atmosphere models, groundwater flow and transport models, and soil geochemical models (see Beven and Freer (2001) for detailed references). To our knowledge, the testing of equifinality in parameter space has not been applied previously to sediment transport modelling. Because of constraints imposed by the computational time of our three-dimensional numerical model, we have approached the concept of equifinality in a mathematically less complex way by defining a limited number of multiple model calibration sets. This is a less quantitative definition of model equifinality than that by Beven (1993) but a more quantitative approach than the sensitivity studies commonly applied in sediment transport studies. The essential difference of our work compared to a normal sensitivity study is that the latter investigates the sensitivity of the model to parameter settings but does not strive for model settings which are as accurate as the original model in reproducing observations. As a result, a sensitivity study is not sufficient to evaluate the model accuracy with respect to past or future impacts in an estuary. This work therefore provides a first endeavour in identifying the effect of equifinal parameter sets on predicted changes in SSC and siltation rates due to interventions.

As a result of both epistemic and variability uncertainty, models that realistically reproduce present-day system dynamics may not be able to accurately provide forecasts (Oreskes et al. 1994). But despite all the uncertainties associated with input data, model formulations, calibration and validation, there is still the need to make management decisions in areas affected by water quality concerns, rising sea levels and morphological change. A model can (1) provide insight into system behaviour, even with uncertainties in input data and parameter space and (2) contribute new understanding on process behaviour and process interaction. As such, they can support or challenge a theory by providing evidence to strengthen or refute respectively (Oreskes et al. 1994) what may already be partially established through field monitoring or historical analyses.

The main contribution of the present work to the wide range of literature on model uncertainty is threefold. First, this work demonstrates that the equifinality concept can be demonstrated in three-dimensional sediment transport models, albeit in a mathematically more simplified way. Secondly, it provides a means to quantify the accuracy of model scenario predictions. The two scenarios investigated in this study were only marginally affected by parameter settings, demonstrating that the uncertainty associated with model input parameters settings has no substantial effect on the predictive capacity of the model. Thirdly, our analyses reveal areas where more field data is needed. Even though the model is equifinal in areas where data is available (the tidal channels), the computed SSC on the tidal flats does depend on model settings. The degrees of freedom of the model input parameter can be narrowed down by doing more detailed field observations on the tidal flats. Additionally, the two intervention scenarios executed in this paper mainly involved channel dynamics, and it is likely that a scenario involving tidal flat dynamics will be influenced differently by model parameter sets.

The methodology provided in this paper is a first step in quantifying the effect of model uncertainties in parameter space on model scenario studies, needed for estuarine sediment management. Future studies should be extended with examination of the uncertainty resulting from process formulations, in addition to varying calibration parameters.

6 Conclusions

Estuarine suspended sediment transport models are frequently calibrated against observations collected in tidal channels. Such observational data can be reproduced equally well with a range of model input parameters demonstrating equifinality. This range of model parameters influences the importance of individual sediment transport processes, by e.g. strengthening burial in the seabed, settling from suspension or erosion. Different combinations of model input parameters leading to the same result (equifinality) may weaken the predictive capacity of the model. However, this work showed the predictive capacity of the model to simulate the effect of human interventions on the estuarine sediment dynamics is not critically influenced by equifinality. Two scenarios executed with the model (offshore disposal of dredged sediment and restoration of the tidal channel depth) were only marginally influenced by the model calibration settings. This strengthens the confidence in the numerical model predictions: the modelled response to interventions seems only limitedly affected by numerical model settings.

However, future model scenarios in which the sediment dynamics over the intertidal flats are more important will probably be much more influenced by model parameter settings. This is made evident by the sensitivity of SSC over the tidal flats, for the different model settings. Even though the sediment concentration in the channels was comparable for the various model calibration sets, the sediment concentration on the tidal flats was notably different. This stresses the importance of collecting a wide range of observational data, varying from SSC measurements in the channel to the intertidal flats.