A benchmark for variably saturated variable-density variable-viscosity flow and solute transport in porous media

In natural environments, fluid density and viscosity can be affected by spatial and temporal variations of solute concentration, for example, due to saltwater intrusion in coastal aquifers, leachate infiltration from waste disposal sites, and upconing of saline water from deep aquifers. Potentially unstable situations may arise in which a dense fluid overlies a less dense fluid. This situation can produce instabilities manifested by dense plume fingers moving vertically downwards counterbalanced by vertical upward flow of the less dense fluid. The resulting free convection increases solute transport rates over large distances and times relative to constant-density flow. Unstable brine flow is further complicated if the porous medium is variably saturated. The results from a laboratory experiment of variably saturated variable-density flow and solute transport from Simmons et al. (2002) are used as the physical basis to define a new mathematical benchmark. This benchmark aims at realistically reproducing the experimental fingering patterns. Random hydraulic conductivity fields were used in the simulations as a numerical perturbation method to realistically mimic the observed dense plume fingering. The HydroGeoSphere code coupled with PEST are used to calibrate the parameter set that defines the benchmark. A grid convergence analysis is performed to obtain the adequate spatial and temporal discretizations. The new mathematical benchmark is useful for model comparison and testing of variably saturated variable-density flow in porous media. Simmons CT, Pierini ML, Hutson JL (2002) Laboratory investigation of variable-density flow and solute transport in unsaturated–saturated porous media. Transp Porous Media. 47(2): 215–244, 10.1023/A:1015568724369.


Introduction
In hydrogeological systems, variable-density flow and solute transport phenomena are relevant in seawater intrusion, saltwater upconing in coastal aquifers, dense contaminant plume migration, and dense non-aqueous phase liquid (DNAPL) studies, in both the phreatic and the vadose zone (Simmons 2005).A potentially unstable situation may develop, in which a dense fluid overlies a lighter fluid.That unstable situation can lead to sinking of dense plume fingers, counterbalanced by vertical upward flow of freshwater (Simmons 2005;Simmons et al. 2002).The resulting free convection is relevant since it increases the solute transport rate and mixing over large distances and times compared to constant-density flow Simmons 2005.The importance of variable-density flow has been reported in review papers from Diersch and Kolditz (2002); Simmons (2005); Simmons et al. (2001).Recently, Simmons et al. (2010) gave an extensive evaluation of advances in the field of variable-density flow and transport, in which physics, modelling approaches and future challenges are given.
Numerical modeling is a useful method to investigate variable-density flow and transport.For numerical computer models to be applicable, they have to be tested against benchmarks.Benchmarks are model problems that are based on physical models, for which qualified laboratory data are available (Diersch and Kolditz 2002).Benchmarks should represent the mathematical and physical character of the underlying physical variable-density flow problem that normally has a simplified geometry so that comparable numerical solutions of accepted quality are obtainable (Diersch and Kolditz 2002).Henry (1964) presented a numerical benchmark for saltwater intrusion in coastal aquifers.The domain consisted of a vertically oriented rectangle with a constant head and constant saltwater concentration at the seaside boundary, constant flux and constant freshwater concentration at the landside inflow, and impermeable top and bottom boundaries.This setup leads to the formation of a salt wedge that is located below the freshwater and diminishes towards the landside boundary.This problem has been widely used, modified and improved by various authors (e.g.Pinder and Cooper 1970;Simpson andClement 2003, 2004).Elder (1967) experimentally investigated free thermal convection in an aqueous medium.Physical experiments were conducted using a rectangular Hele-Shaw cell (with dimensions 20 cm × 5 cm × 0.6 cm) heated from below.The induced temperature gradient lead to density differences in the fluid, which triggered thermal convective patterns.Hele-Shaw (1898) showed that in a cavity where the width is much smaller than the length or height of the domain, the convective motion is similar to that in porous media (Elder 1967).Furthermore, Elder (1967) numerically benchmarked the experiments using a finite difference representation of the dimensionless fluid mass conservation and heat transfer equations which were simplified using the Oberbeck-Boussinesq (OB) approximation (Boussinesq 1903;Oberbeck 1879).Voss and Souza (1987) defined an upscaled solute analogue of the thermal Elder problem which has the dimensions of 600 m × 150 m with a unit thickness of 1. Constant concentrations leading to fluid densities of 1,200 and 1,000 kg m −3 were assigned to the top middle and bottom boundaries, respectively, leading to free solutal convection.The solute analogue was numerically simulated using the finite element code SUTRA (Voss and Provost 2010), which uses the fluid and solute mass conservation equations at the second non-OB level (Guevara Morel et al. 2015).The resulting dimensionless concentration isochlors were compared to the dimensionless isotherms from Elder (1967), and results were found to be in good agreement.An important feature of the solute analogue Elder problem is that a very high diffusivity of 10 −6 m 2 s −1 was used in order to mimic thermal diffusion of the original thermal Elder problem (Voss and Souza 1987).Schincariol and Schwartz (1990) carried out experiments using a fully saturated flow tank (1,168 mm × 710 × 50 mm) filled with homogeneous and heterogeneous porous media to investigate variable-density flow and mixing.NaCl solutions (ranging from 999 to 1,066.2 kg m −3 ) were injected from the left side of the flow tank that initially contained a fluid with a density of 998.2 kg m −3 .The experimental results showed that, using a homogeneous medium, density differences as small as 0.0008 kg m −3 trigger instabilities that lead to unstable convective patterns.For the heterogeneous medium, complex and highly dispersed plumes were observed.Schincariol et al. (1994) numerically benchmarked the physical experiments using the finite element code VapourT (Mendoza 1992), which uses the fluid and solute mass conservation equations at the OB level (Guevara Morel et al. 2015).Schincariol et al. (1994) suggested the use of perturbations in order to adequately represent the instabilities observed in the physical experiments.Wooding et al. (1997a, b) experimentally examined fully saturated solutal convection in groundwater under an evaporating salt lake using a vertically oriented rectangular Hele-Shaw cell (150 mm × 75 mm × 2 mm).Lateral and bottom boundaries were impermeable.Two thirds of the top boundary were also impermeable, and the remaining third was exposed to evaporation.Evaporation allowed the accumulation of brine in the top layer, which triggered solutal convection.Development and movement of dense plume fingers were monitored.Wooding et al. (1997a, b) numerically simulated the physical experiments using a finite difference approach and the OB approximation (Guevara Morel et al. 2015) for the fluid and mass conservation equations.Simmons et al. (1999) numerically benchmarked the salt lake problem using SUTRA (Voss and Provost 2010), where perturbations in the form of white noise were applied to the specified concentration values along the top boundary to trigger fingering.Results indicated that numerical experiments that were not perturbed did not show the same fingering pattern as the physical experiments.Oswald (1998) conducted fully saturated physical experiments to investigate the upconing of a saltwater-freshwater interface due to pumping.A complete description of the saltpool problem (Oswald 1998) setup can be found in Oswald and Kinzelbach (2004).The domain consists of a cubic box with an edge size of 20 cm that was filled with a homogeneous and isotropic porous medium (glass beads).Initially, a saltwater layer of 6 cm thickness was placed below a freshwater layer.Inflow of freshwater and outflow of mixed water through opposite top corners lead to the upconing of saltwater.Concentrations at the outlet were monitored, which allowed the definition of a breakthrough curve at the outflow hole.The influence of two density contrasts-maximum densities: 1,005.9kg m −3 (case 1) and 1,072.2kg m −3 (case 2)-on upconing was tested.Johannsen et al. (2002) numerically benchmarked the saltpool problem using the d 3 f code (Fein 1998), which uses the mass conservation equations at the second non-OB level (Guevara Morel et al. 2015).Results indicated that case 1 is not affected by transversal dispersion, and a coarse spatial resolution is sufficient to capture advection and longitudinal dispersion effects.For case 2, a very fine spatial resolution is needed to adequately represent transversal dispersion effects.The numerically simulated breakthrough curves were in good agreement with the physical experiments.Thorenz et al. (2002) physically and numerically investigated saltwater movement in variably saturated porous media.Physical experiments were conducted in a vertically oriented flow cell (865 mm × 478 mm × 105 mm) filled with sand.The flow cell was equipped with 16 equally spaced tubes placed horizontally throughout the system to allow tracer injection and sample extraction.Results showed that variable-density flow effects and significant lateral flow are present in the unsaturated zone and in the capillary zone.The numerical simulations were carried out with the finite element code ROCKFLOW (Kolditz et al. 1995;Kröhn 1991;Ratke 1995), which uses the second non-OB level (Guevara Morel et al. 2015) of the fluid and solute mass conservation equations.The numerical model realistically reproduces the flow behavior observed in the physical experiment.Stoeckl and Houben (2012) experimentally studied flow dynamics and the age stratification in a freshwater lens.The physical experiments were done in a rectangular acrylic box with dimensions of 2 m × 0.5 m × 0.05 m filled with homogeneous sand under fully saturated conditions.The ambient saltwater present in the box had a density of 1,021 kg m -3 .Recharge by freshwater with a density of 997 kg m −3 was simulated using 15 drips located at the top center part of the model.Tracers were used for the visualization of the age stratification in the freshwater lens.The experiment showed that flow paths in the freshwater lens are always in contact with the surface.Stoeckl et al. (2016) defined the numerical benchmark of the physical experiments.Taking advantage of the symmetry of the problem, Stoeckl et al. (2016) used a reduced domain size (0.9 m × 0.3 m × 0.05 m).The numerical simulations are carried out using the finite element code FEFLOW (Diersch 2002), where the fluid and mass conservation equations are solved at the first non-OB level (Guevara Morel et al. 2015).Results indicated a good agreement between experimental solutions and numerical results.Simmons et al. (2002) experimentally investigated variable-density free convection in variably saturated porous media.The experiments were carried out in a flow tank with inner dimensions of 1.178 m × 1.06 m × 0.053 m filled with homogeneous sand.Calcium chloride (CaCl 2 ) solutions with densities of 1,001.5 kg m −3 (low), 1,048 kg m −3 (medium), and 1,235 kg m −3 (high) were infiltrated from above into the sand tank.The infiltration of CaCl 2 solutions was done via a PVC box located at the center top of the domain connected to a constant-head reservoir.Volumetric losses from the reservoir were recorded in time which allowed an infiltration rate calculation.Plume penetration height over time was documented for different density and saturation cases, and photographs of the convective patterns were taken at selected times.Simmons et al. (2002) concluded that the onset of fingering depends not only on fluid density gradient and porous medium permeability, but also on water saturation.For the low-density case, the plume spreads laterally when reaching the water table, and no significant fingering was observed.For the medium-and high-density cases, the plume reached the water table and saturated plume fingers formed.Simmons et al. (2002) remarked on the importance of the unsaturated zone and the water-table location in order to predict the ultimate fate of a dense plume.Cremer and Graf (2015) used the high-density case from Simmons et al. (2002) under variably saturated conditions to numerically reproduce the physical experiments.Cremer and Graf (2015) used the HydroGeoSphere model (Therrien et al. 2004) with the mass conservation equations at the OB level.The aim of their study was to determine adequate numerical perturbation mechanisms (e.g.random hydraulic conductivity fields, solute source perturbations) capable of realistically reproducing the fingering patterns seen in the physical experiments.Results indicated that nonperturbed simulations showed no realistic finger formation.Random hydraulic conductivity fields and spatially random timeconstant perturbations of the solute source were found to appropriately generate plume fingers.While the focus of Cremer and Graf (2015) was on the efficiency of perturbation mechanisms, the laboratory experiments by Simmons et al. (2002) have not been mathematically benchmarked in Cremer and Graf (2015).Knorr et al. (2016) conducted experimental and numerical studies, which aimed at determining the representativeness of 2D modeling when simulating 3D variable-density flow.The experimental setup consisted of a sand column with a diameter of 0.48 m and a height of 0.508 m.The sand column experiments were conducted using Bromide solutions with different densities (996.4 and 992.9 kg m −3 ) injected at three different velocities.The experiments showed that the impact of variable-density flow increases with decreasing density and decreasing flow rates.The experiment was numerically reproduced using FEFLOW (Diersch 2002) using both a two-dimensional (2D) axisymmetric and a three-dimensional (3D) representation.Both models were assessed by comparing numerically generated finger patterns, concentration breakthrough curves and mass recovery to the experimental results.Belfort et al. (2017) used a noninvasive imaging technique to obtain water contents in unsaturated porous media.An experiment in a 2D flow tank, filled with monodisperse quartz sand, and with inner dimensions of 0.40 m × 0.14 m × 0.06 m was conducted.The sand tank was initially saturated and afterwards systematically drained leading to an unsaturated zone and the change of position of the water table with respect to the saturated state.The experiment was numerically reproduced using the methodology described by Belfort et al. (2013) and Fahs et al. (2009).Experimental and numerical results are overall in good agreement; furthermore, the sensitivity of the numerical results to the chosen hydraulic parameters is highlighted.Geng et al. (2020a) investigated the influence of swash motions on subsurface flow and moisture dynamics in coastal aquifers using a density-dependent, variably saturated groundwater flow model.Results from a homogeneous system (Geng et al. 2017) were compared with systems in which a heterogeneous distribution of hydraulic conductivity and capillarity were used.Results show that capillary heterogeneity considerably increases the spatial variability of water content, which impacts hydraulic conductivity.This induces the formation of a so-called capillary barrier which limits flow from low-permeability to high-permeability zones.Consequently, solute transport is also affected by water content variations.Geng et al. (2020a), (b) emphasize the importance of geologic heterogeneity in subsurface flow dynamics.Younes et al. (2022) presented a new numerical model for the simulation of variably saturated flow and contaminant transport.The model is suitable for modeling dense contaminant transport in unsaturated porous media for large-scale problems.The developed model was tested using three test cases.Case 1 was based on the laboratory experiments from Vauclin et al. (1979) which investigated the transient position of the water table under artificial recharge.The model consists of a rectangular sandbox (0.60 m × 0.20 m) with the water table initially located 0.65 m from the bottom.Initially, hydrostatic pressure distribution and zero pollutant concentration are imposed.A contaminant solution (densities: 1,000 and 1,100 kg m −3 , respectively) is infiltrated at a constant flux of 86.4 cm/day at 0.2 m in the center of the soil surface.A constant head of 0.65 m is applied below the water table.Case 2 deals with the infiltration of dense contaminant solution in a heterogeneous initially dry soil (Forsyth and Kropinski 1997).The model domain is a rectangular model (1.25 m × 2.3 m) containing two horizontal layers.The top 0.4 m consists of a clayey material while the bottom 1.9 m is sandy soil.A constant pressure head of -0.10 m is prescribed from 0 ≤ x ≤ 0.2 m at the model surface.At the bottom of the model, a constant pressure head of 10 -4 cm is applied.The infiltrated contaminant solution has a density of 1,025 kg m −3 .Case 3 is a simplified 2D conceptual model of the Akkar unconfined coastal aquifer in Lebanon, in which salt-water intrusion combined with climate change projections and long-term pumping regimes were modeled.The model has a length of 8 km and a varying depth from 100 to 170 m.A heterogeneous permeability field with an average value of 0.945 × 10 -11 m 2 and a variance of 1.0 m 4 was used.
A summary of existing variable-density flow benchmarks is given in Table 1.To the authors' knowledge, flow dynamics and solute transport in free convective regimes under variably saturated and variable-density conditions are physical phenomena, for which a mathematical benchmark is absent.
The objective of the present study is therefore to define a new numerical benchmark for free solutal convection under variably saturated conditions based on the physical experiments presented by Simmons et al. (2002).
This paper is structured as follows.Section 2 describes the modeling concept used.The variable-density flow and solute transport physical experiments conducted by Simmons et al. (2002) used in the present study as well as the governing equations are described.A brief description of HGS (Therrien et al. 2004) and PEST (Doherty 2004(Doherty , 2010) is also given.The conceptual model is described, and the perturbation method used to mimic pore-scale heterogeneity that triggers dense plume fingering is explained.Section 3 describes the sensitivity of the plume penetration height when chosen parameters are varied.A grid convergence analysis in space and time is conducted to determine the adequate spatial-temporal resolution, and the parameter calibration process is explained.Section 4 describes the properties (e.g.grid discretization, calibrated parameters) of the new numerical benchmark.Section 5 describes how to set up the numerical benchmark, and finally, in section 6, conclusions and a summary of the paper are given.

Physical model
This study focuses on four cases of the free convective experiments conducted by Simmons et al. (2002): • Unsaturated high-density case (UHD) • Saturated high-density case (SHD) • Unsaturated medium-density case (UMD) • Saturated medium-density case (SMD) The mathematical conceptualization of these experiments is shown in Fig. 1.

Mathematical model: governing equations
Variable-density variable-viscosity flow in unsaturated and saturated porous media is described by the Darcy equation (Bear 1988): where  where p [M L −1 T −2 ] is fluid pressure and g [L T −2 ] is gravitational acceleration.No large pressures are present in the sand tank allowing the assumption of fluid and matrix incompressibility, which implies that the specific storage coefficient be zero and that porosity ϕ [-] be constant.Therefore, the mass conservation equations for fluid and solute mass fraction ω [-] are given by (Bear 1988): where S w [-] is water saturation, t [T] is time, and D [L 2 T −1 ] is the hydrodynamic dispersion tensor defined as (Bear 1988): where α L [L] and α T [L] are longitudinal and transversal dispersivities, respectively, τ [-] is tortuosity, and D d [L 2 T −1 ] is the free-solution diffusion coefficient.The relation between water saturation S w and freshwater pressure head ψ 0 is given by van Genuchten (1988): (2) where S wr [-] is residual saturation, and α The relation between water saturation S w and relative permeability k rw is given by van Genuchten (1988) based on Mualem (1976): where S e = (S w − S wr )/(1 − S wr ) [-] is effective saturation, and l p [-] is pore-connectivity parameter.Mualem (1976) estimated that l p is 0.5 for most porous materials (Therrien et al. 2004).

HydroGeoSphere model
The HydroGeoSphere (HGS) model was used to conduct the simulations in the present study.HGS is a 3D numerical model that simulates variable-density, variable-viscosity variably saturated flow and transport.HGS applies the control volume finite element (CVFE) method to discretize the fluid mass conservation equation, the Galerkin finite element for the solute mass conservation equation, and a finite difference Eulerian scheme to discretize the time derivatives (Graf and Therrien 2005;Therrien et al. 2004).HGS linearizes the non-linear unsaturated equations with the Newton-Raphson method.Additional nonlinearity caused by density variations is linearized with the Picard method (Therrien et al. 2004).An adaptive time-stepping scheme is used here where time-step sizes are calculated based on changes in the solute mass fraction (Graf and Degener 2011).More details on HGS are given in Therrien et al. (2004).

PEST
The model-independent parameter estimation and uncertainty analysis computer program PEST (Doherty 2004, ( 6) 2010) is used for the parameter calibration.PEST uses a nonlinear estimation technique called the Gauss- Marquardt-Levenberg-method (Levenberg 1944;Marquardt 1963).PEST adjusts the HGS parameters so that discrepancies between selected HGS outputs and corresponding measurements made by Simmons et al. (2002) are minimized.PEST runs HGS as many times as necessary to find the optimal parameter set (Doherty 2004(Doherty , 2010)).More details on PEST are given in Doherty (2004Doherty ( , 2010)).

Conceptual model
Geometry, initial and boundary conditions of the numerical model are shown in Fig. 1 (adopted from Cremer and Graf 2015).Right, left and bottom sides are no-flow boundaries.For the saturated cases, a constant freshwater pressure head boundary (Dirichlet) condition of ψ 0 = 0.0042 m (adopted from Cremer and Graf 2015) is imposed along the solute source to enable saltwater ponding, and ψ 0 = 0.0 m along both sides of the source.For the unsaturated cases, a time-varying specified flux (Neumann) boundary condition (adopted from Simmons et al. 2002) is prescribed along the solute source, and seepage nodes are placed at the top domain corners to allow water outflow in the case that the water table reaches the top such that water flows out of the domain.A constant solute mass fraction (ω max ) of 0.2535 and 0.05178 for high-and medium-density cases, respectively, is imposed at the solute source, all other boundaries are zero-dispersive mass flux boundaries.Initial conditions are ψ 0 = 1.06 mz for the saturated cases and ψ 0 = 0.62 mz for the unsaturated cases.Table 1 shows the material parameter set used for the numerical simulations.The uncalibrated material parameter set will be used to conduct an error analysis to determine an adequate spatial-temporal resolution and will be calibrated for the benchmark definition.
Figure 2 shows the physical convective patterns and nonperturbed numerical simulations for UHD and SHD, respectively.Clearly, the convective patterns are not realistically reproduced by the numerical model.To adequately trigger fingering caused by pore-scale heterogeneity in the porous medium, random hydraulic conductivity fields (random K 0 -fields) are used (Cremer and Graf 2015).As in Cremer and Graf (2015), the random K 0 -fields are generated using a random K 0 -field generator provided by Olaf Cirpka (University of Tübingen, Germany).Each K 0 -value is assigned to an element in the numerical grid.The K 0 -values are uncorrelated isotropic fields (white noise) because the physical experiments used a porous material that is isotropic and homogeneous.Uncorrelated fields are created by choosing correlation lengths smaller than the cell size of the spatial grid (Cremer and Graf 2015).Sampling the generated random K 0 -fields and applying the semivariogram function produces a semivariogram with a range = 0.This indicates that the sampled points have no relationship to each other, which means that the hydraulic conductivities may vary over short distances.

Error analysis
The UHD case is selected to conduct the grid convergence test because it is expected that the UHD case requires the finest spatial-temporal grid.The adequate spatial and temporal discretization is determined for the UHD case using a grid convergence test where a series of spatial-temporal grids is used for the numerical simulations.Here, g ς,θ represents the numerical grid at spatial and temporal discretization levels ς and θ, respectively.

Spatial discretization
Spatial discretization is investigated using a hierarchy of uniformly refined grids.Local grid refinement is not used in this study.The domain is spatially discretized by using 3D cuboidal elements.Elements at the spatial discretization level ς (ς = 0, 1, ..., 5) have the size (in meters): Increasing ς corresponds to refining the spatial grid.Thickness of the single element in the y-direction is Δy = 0.053 m for all ς, which is a 2D representation of the physical experiment (Cremer and Graf 2015).

Temporal discretization
For a given time discretization level θ (θ = 0, 1, ..., 5), timestep sizes are calculated according to the rate of change of the solute mass fraction on node I as (Therrien et al. 2004): where L and L + 1 are the current and new time levels, respectively, and ω * (θ) = ω max /2 θ [-] is the maximum allowed change in solute mass fraction during a time step.Increasing θ gives smaller ω * (θ) values and smaller time-step sizes. (10)

Numerical accuracy
Numerical simulations were carried out using the different g ς,θ , and the solute mass flux J at the z-elevation 0.8480 m is observed, which is approximately at the half thickness of the initially unsaturated zone.J is chosen since it is a spatially integral value.J was observed for UHD at output times 6, 9, 12, 15, 18, 20, 26, 32, 38, 40, 44, 46, 48 min.The J values for g ς,θ at output times t out are denoted as J t out , and written in vector form as: As in Graf and Degener (2011), the L 2 -norm is used to calculate the relative spatial error ε ς+1/2,θ [-] between numerical solutions with increasing spatial discretization levels, and it is defined as (Johannsen et al. 2002): Similarly, the relative temporal error ε ς,θ+1/2 [-] between numerical solutions with increasing temporal discretization levels is defined as: Grid convergence is defined to be achieved when both relative discretization errors are smaller than the threshold value of 0.01 (1%) (Graf and Degener 2011).
Figure 3 shows the simulated solute distribution and the calculated relative discretization errors [%] at t = 40 min for UHD.Spatial discretization errors (Eq.14) and temporal discretization errors (Eq.15) are given by the numbers depicted horizontally and vertically, respectively.
In all cases shown in Fig. 3, fingering is generated by the random K 0 -field.Clearly, the solute distribution depends on the spatial discretization level, while temporal ( 13) discretization appears to have a minor effect on solute distribution.Unphysical solute mass fraction values are depicted in black (overshoots) and white (undershoots), and are observed at spatial discretization levels ς < 4 and temporal discretization levels θ < 4. The range of over-and undershoots reduces as the discretization level in space and time increases.No unphysical results are observed for spatial discretization levels ς ≥ 4 and temporal discretization levels θ ≥ 4. The widely used Pecletcriterion for numerical stability (Pe g = Δx/α = 1.785 ≤ 2) is fulfilled with that spatial discretization.

Definition of the new benchmark problem
The new benchmark problem is defined by comparing the experimentally determined plume penetration heights z lab [L] given by Simmons et al. (2002) with the numerically simulated plume penetration heights z num [L].Here, plume penetration height z num is defined as the z-position of the deepest fingertip that has the solute mass fraction of 70% of ω max unless otherwise stated.The choice of 70% of the solute mass fraction was based on numerical trials where different values for the solute mass fraction were tested.

Parameter sensitivity
Grid g 4,4 is used for the sensitivity analysis.Sensitivity of z num with respect to the problem parameters is analyzed to identify the relevant parameters that have to undergo calibration using PEST (section 'PEST').Uncalibrated base parameter values are given in Table 1.Porosity ϕ is assumed to be adequately estimated by Simmons et al. (2002) and is therefore excluded from the sensitivity test.
Simulations have demonstrated that α T has no significant effect on z num (results not shown here), such that α T will also not be part of the PEST calibration.The value of α T slightly influences the lateral dissipation of the solute distribution, which influences the width but not the length of the plume fingers.A base parameter set vector p is defined as p = {K 0 , α L , D d , S wr , α, n}.The z num produced by parameter set p at time t is denoted as z num (p, t).Furthermore, z num values at output times t out = t 1 …t n = 6, 9,12,15,18,20,26,32,38,40,44,46, 48 min are written in vector form as Changes of Z num caused by a change in a parameter from p are quantified using the dimensionless elasticity index EI [-].EI ranges from 0 to 1 and gives the relative change for each member of Z num with respect to the relative change in parameter value p i from its base value p 0 i .Large values of EI for a given parameter value p i indicate a large change of Z num .EI is defined as (Loucks and van Beek 2005) where SI is the sensitivity index defined as (Mccuen and Snyder 1983): and b [%] is the change of parameter value p i from p 0 i , and n p = 6 represents the total number of parameters.
The L 2 norm is used to quantify the relative variation ε i [-] caused by parameter variation on plume penetration height z num for UHD at output times t out = 6, 12, 15, 20, 26, 32, 40, 46 min.ε i is defined as: For UHD, Fig. 4a, b shows the relative variations ε i of z num caused by ±30% variations in K 0 , α L and D d .Clearly, z num is highly dependent on K 0 , while α L and D d do not impact z num .This result shows that K 0 has to be included in the parameter calibration process, while α L and D d do not.
Figure 4c,d shows the relative variations ε i of z num caused by ±20%, ±11% and ±22% variations of the van Genuchten parameters α, n, and S wr , respectively.The van Genuchten parameters were varied according to the sand soil type values given by Carsel and Parrish (1988).The results suggest that α and n slightly influence z num , while the residual saturation S wr has a negligible influence on z num .This suggestion is clearer in Fig. 5 where the temporal evolution of EI is shown.Clearly, K 0 and n are the parameters that have the largest influence on z num .In addition, z num appears to be influenced by α but only insignificantly by D d , α L and S wr .These results indicate that, in case of unsaturated conditions, z num is most significantly controlled by K 0 , α and n, while α L , D d and S wr have weak impacts on z num and are therefore not included in the calibration.

Parameter calibration with UHD
The parameter calibration is carried out using the measured z lab values for the UHD case (Simmons et al. 2002).The calibration requires coupling the random K 0 -field generator (section 'Conceptual model') with HGS (section 'Hydro-GeoSphere model') and PEST (section 'PEST').The calibration procedure is as follows: 1.A random K 0 -field is generated using the initially uncalibrated K 0 (Table 1) with a given variance σ and correlation length (Cremer and Graf 2015).2. The UHD case is numerically simulated using HGS taking the generated random K 0 -field and initial parameter vector p as input.3. PEST compares the simulated z num values with the measured z lab values and modifies K 0 , α and n in order to minimize the objective function used by PEST, which is the sum of the weighted squared residuals between simulated z num and measured z lab (Doherty 2004(Doherty , 2010)).
A maximum allowed change of the objective function of 1% was used as convergence criteria.
Table 1 shows the parameters to be calibrated and the ranges in which the parameters are allowed to vary.The uncalibrated K 0 value ranges were taken from Simmons et al. (2002).The van Genuchten parameters n and α ranges for sand were taken from Carsel and Parrish (18) 1988).Parameter calibration was carried out using grid g 4,4 as suggested by the grid convergence test.Random K 0 -fields are created using a variance of σ = 10 −2 [m 2 s −2 ] and correlation length in x-and z-direction is 0.0021 [m].
The correlation length is slightly smaller than the element sizes from g 4,4 creating uncorrelated isotropic K 0 -fields of white noise.The parameter calibration lead to the calibrated parameter values K 0 = 0.00112 m s −1 , n = 2.39 and α = 11.6 m −1 .Incorporating random K 0 -fields can likely affect the simulated plume penetration height z num .To assess the effect of random K 0 -fields on z num , a defined number of stochastic numerical simulations is needed to produce statistical results (mean z num and standard deviation z num ) in order to analyze the dependency of z num on the K 0 -field (Xie et al. 2011).Here, z num is the mean plume penetration and z num is its variability.The central limit theorem in the probability theory (Kreyszig 1988) states that the number of generated K 0 -fields should be large enough (here 30) to obtain a Gaussian distribution (Xie et al. 2011).Therefore, the UHD problem was simulated with the calibrated parameters 30 times using 30 different random K 0 -fields in order to determine z num (Xie et al. 2011).Figure 6 shows the fingering pattern of the physical experiment, and five different numerical convective patterns at t = 32 min and 40 min.Overall, the numerical convective patterns are in very good agreement with the physical convective patterns.A measure of the goodness of fit between the observed z lab and numerically simulated z num penetration heights is provided by the correlation coefficient r, which is defined as (Cooley and Naff 1998): where z lab [L] are the measured observations of the plume penetration height, z num [L] are the numerical mean values of the plume penetration height (mean values of the 30 realizations), z num [L] is the mean of z num , and z lab [L] is the mean of z lab .Hill (1998) stated that r should be greater than 0.90 so that the fit between observation and model output is considered acceptable.Deviations between z lab and z num at output times t out (explained in section 'Numerical accuracy') are quantified using the L 2 -norm as: For UHD, Fig. 7 shows the temporal evolution of z lab , z num and z num corresponding to z num for five realizations of the random K 0 -field.The simulated values adequately follow the trend of the measured values.Variability is considered small and is not expected to increase significantly if more simulations were conducted.The numerical model reproduces adequately the trend of the observed data and a good agreement between observed and simulated data is achieved.

Parameter validation with SHD, UMD, SMD
The calibrated parameters are validated with the cases SHD, UMD and SMD.Random K 0 -fields were generated using the calibrated K 0 and using z num values of 10 −3 , 10 −1 and ( 19) 10 −3 , for SHD, UMD and SMD, respectively, as suggested by Cremer and Graf (2015).Moreover, the same correlation length as for UHD using grid g 4.5,4 was used.Figure 8 shows the physically and numerically simulated convective patterns for SHD, UMD and SMD at selected times.For brevity, only the results of two realizations of the K 0 field are shown.Figure 8 demonstrates that convective pattern, plume migration, finder generation, and plume penetration height are simulated very realistically in all three scenarios.For SHD and SMD, the simulated number of fingers as well as the horizontal and vertical extent of the plume agree very well with the results from the experiment.For UMD, the numerical results confirm that density effects are not the  8 Fingering pattern for the physical experiments and two numerical realizations for SHD, UMD and SMD at selected times dominant driving force in the unsaturated zone because the relative permeability k rw is very low due to the very low water saturation.Instead, brine flow is dominated by capillary forces creating an almost symmetric half-circle of dyed brine, similar to the UHD case (Fig. 6).In conclusion, there is a very good agreement between physical and numerical results for all three validation cases SHD, UMD, SMD.
Figure 9 shows the temporal evolutions of z num , corre- sponding to z num , and z lab for SHD, UMD and SMD. Figure 9 demonstrates that, calibrated parameters obtained from the UHD case reproduce the other three validation cases (SHD, UMD and SMD) very adequately.The parameter calibration process is therefore considered to be validated.Plume penetration heights were simulated very similarly for all cases.The variability of the simulated z num curves is low, and it is therefore not expected that in any of the cases the spread z num will increase if more realizations are conducted.

Application of the benchmark problem
The following procedure should be followed to apply the numerical benchmark: 1. Select one case, ideally UHD or alternatively SHD, UMD or SMD. 2. Generate one random-K 0 field using a grid with spatial discretization Δx = 0.002142 and Δz = 0.002120 m, the calibrated K 0 value (Table 1), corresponding variance σ (σ = 10 −2 , 10 −3 , 10 −1 , 10 −3 for UHD, SHD, UMD and SMD, respectively) and correlation length of 0.0021 m in both the x-and z-directions using a maximum change of solute mass fraction equivalent to 6.3% of ω max for each time step.3. Carry out numerical simulations of the selected case using the calibrated parameters (Table 2), corresponding boundary and initial conditions (Fig. 1) and the random K 0 -field.Create an output of the finger height having a solute mass fraction of 70% of ω max .
4. Calculate an average penetration height and compare with the corresponding calibrated curve.The corresponding curves for UHD that are depicted in Figs.7 and 9 show the curves for SHD, UMD, SMD. 5.The selected case is correctly reproduced if the calculated mean value of the penetration height falls in between the z p ± z p limits from the corresponding curve.

Summary and conclusions
This present study introduces a new numerical benchmark of free convective variable-density flow and transport in saturated and unsaturated porous media.The new benchmark is based on the physical experiments conducted by Simmons et al. (2002) which consisted of a homogeneous  sand tank for which four cases were conducted (two saturated and two unsaturated cases).Calcium chloride solutions with two different densities were introduced along the top for each of the cases.The plume penetration height as well as the solute distribution were recorded at selected times for the four cases (Simmons et al. 2002).
The HydroGeoSphere flow and transport numerical model was used to numerically reproduce the physical experiments.The governing fluid and solute mass conservation equations were expressed in terms of equivalent freshwater pressure head (ψ 0 ) and solute mass fraction (ω), respectively.Plume fingering was triggered with random K 0 -fields, as suggested by Cremer and Graf (2015).A grid convergence test in both space and time was conducted, observing the change in the solute mass flux at a given depth in order to determine adequate grid resolution.A sensitivity analysis was completed to obtain the parameter set that underwent calibration.The parameter calibration process was carried out using PEST (Doherty 2004(Doherty , 2010)).The major key findings of this study are: 1.A new numerical benchmark for coupled water flow and brine transport is presented.In the new benchmark problem, water saturation, fluid density and fluid viscosity are spatially and temporally variable.This results in a high complexity of relevant processes and their physical interactions.2. Developing the benchmark showed that the brine flow is highly sensitive to the equivalent freshwater hydraulic conductivity K 0 , longitudinal dispersivity α L , and van Genuchten parameters α and n. 3. The verification case (UHD), as well as the validation cases (SHD, UMD, SMD) very realistically and adequately simulate the coupled brine flow and salt transport under saturated-unsaturated conditions.
In conclusion, this study presents a variable-density flow and transport numerical benchmark dealing with free convective problems under variably saturated conditions.The new numerical benchmark is useful for code validation as well as for code comparison.Further work is required to focus on getting a better understanding of fluid flow and solute transport in the unsaturated zone, in which matric potential controls infiltration rates.In the vadose zone, the separate analysis of the distribution of both water saturation and solute mass fraction in space and time is suggested.This would be especially relevant for studies which investigate thicker unsaturated zones.

Fig. 1
Fig. 1 Conceptual model, initial conditions and boundary conditions of the Simmons problem.Left, right and bottom boundaries are impermeable to flow and transport.The initial water level for the unsaturated cases (UHD, UMD) is at z = 0.62 m

Fig. 2
Fig. 2 Fingering pattern for the physical experiment and unperturbed simulations at t = 40 min for a-b UHD and c-d SHD

Fig. 3
Fig.3Relative errors for selected output times (refer to section 'Numerical accuracy') and convective patterns for UHD at t = 40 min for all spatial and temporal grid discretization levels.Grid conver-

Fig. 6
Fig. 6 Fingering pattern for the physical experiment (a) and five numerical realizations (R) (b-f) for UHD at t = 32 and 40 min

Fig. 9
Fig. 9 Temporal evolution of the mean plume penetration height z num [m], standard deviation σ znum [m] and range (z nummax [m] and z nummin [m]) for all the numerical realizations and the observed penetration heights z lab for a SHD, b UMD and c SMD

Table 1
Previously performed laboratory experiments (Exp.) of variable-density flow and transport, and corresponding definitions of benchmark problem

Table 2
Material parameter values for all experiments, calibrated with the results from UHD