Constraining probabilistic chloride mass-balance recharge estimates using baseflow and remotely sensed evapotranspiration: the Cambrian Limestone Aquifer in northern Australia

Regional-scale estimates of groundwater recharge are inherently uncertain, but this uncertainty is rarely quantified. Quantifying this uncertainty provides an understanding of the limitations of the estimates, and being able to reduce the uncertainty makes the recharge estimates more useful for water resources management. This paper describes the development of a method to constrain the uncertainty in upscaled recharge estimates using a rejection sampling procedure for baseflow and remotely sensed evapotranspiration data to constrain the lower and upper end of the recharge distribution, respectively. The recharge estimates come from probabilistic chloride mass-balance estimates from 3,575 points upscaled using regression kriging with rainfall, soils and vegetation as covariates. The method is successfully demonstrated for the 570,000-km2 Cambrian Limestone Aquifer in northern Australia. The method developed here is able to reduce the uncertainty in the upscaled chloride mass-balance estimates of recharge by nearly a third using data that are readily available. The difference between the 5th and 95th percentiles of unconstrained recharge across the aquifer was 31 mm/yr (range 5–36 mm/yr) which was reduced to 22 mm/yr for the constrained case (9–31 mm/yr). The spatial distribution of recharge was dominated by the spatial distribution of rainfall but was comparatively reduced in areas with denser vegetation or finer textured soils. Recharge was highest in the north-west in the Daly River catchment with a catchment average of 101 (61–192) mm/yr and lowest in the south-east Georgina River catchment with 6 (4–12) mm/yr.


Introduction
Groundwater recharge is one of the most difficult components of the water balance to estimate as it is impossible to directly measure and must be inferred from other measurements. It is often recommended to use multiple methods when estimating recharge to acknowledge the inherent uncertainty in estimating something that cannot be measured directly (Scanlon et al. 2002). When multiple methods have been used to estimate recharge in a field study, it is rarely undertaken as any more than a comparison of methods (Sibanda et al. 2009;Yin et al. 2011;Walker et al. 2018;Flint et al. 2002).
The chloride mass balance (CMB) method (Anderson 1945) is the most widely used approach for estimating recharge, both globally (Scanlon et al. 2006) and in Australia (Crosbie et al. 2010). It is popular because it is robust over many climate zones and is cost effective, requiring only analyses of chloride in groundwater and rainfall. It is relatively insensitive to the mechanism of recharge, whether that be diffuse recharge through the soil matrix, by-pass flow through macropores or even captured surface water in sinkholes (Alcalá et al. 2011;Bazuhair and Wood 1996). At its simplest, recharge is estimated as: where R is average annual net recharge (mm/year), D is the average annual chloride deposition (kg/ha/year), Cl gw is the chloride concentration of the groundwater (mg/L) and the multiplier of 100 is a unit conversion factor.
The chloride concentration of groundwater can generally be measured fairly easily with a high degree of precision but the chloride deposition due to rainfall is far more uncertain (Naranjo et al. 2015). There is then more uncertainty introduced through upscaling from the point scale to the regional scale. There are few examples in the literature where the uncertainty in regional recharge estimates using the chloride mass balance have been formally explored. Some examples include: a linear error propagation of the standard deviations of each of the input variables interpolated to a gridded basis across Spain Custodio 2014, 2015); Monte Carlo sampling of a Pearson Type III distribution of chloride deposition with a log-normal distribution of chloride concentration on a gridded basis in south-east South Australia (Davies and Crosbie 2018); and, regression kriging using stochastically generated chloride deposition, chloride in groundwater, chloride exported in runoff and the bootstrapping the regression equations across eastern Australia .
Being able to quantify the uncertainty in recharge estimates is a first step; being able to reduce the uncertainty then becomes useful. In modelling, the problem of nonuniqueness in parameter estimation leads to predictive uncertainty (Beven 1993), which can often be reduced by constraining with different types of observations (Xie et al. 2018(Xie et al. , 2017. A similar idea was used recently to constrain point estimates of recharge using the water-table fluctuation method using additional observations in the form of chloride mass balance estimates of recharge and a water balance using remotely sensed evapotranspiration ). This same process should be applicable at a regional scale.
Within a suitably long time period, the groundwater recharge will equal the groundwater discharge. The chloride mass balance estimates of recharge must be greater than the baseflow of the catchment as the baseflow is not the total groundwater discharge (which also includes evapotranspiration from the groundwater, extraction and groundwater flow out of the catchment). Similarly, the excess water (rainfall minus evapotranspiration) must be greater than the recharge as the excess water also includes the runoff component of the water balance. The baseflow and excess water are complementary measurements that can help constrain the lower and upper end of the recharge distribution estimated by the chloride mass balance.
The objectives of this report are to: (1) develop a method for constraining the upscaled probabilistic estimates of recharge from the chloride mass balance using estimates of baseflow and remotely sensed evapotranspiration; (2) to demonstrate the method over the entirety of the extent of the Cambrian Limestone Aquifer in northern Australia;and (3) to thoroughly assess the assumptions in the methodology and their impact on the uncertainty in the recharge estimates.

Study area
The area chosen for this study is the outline of the contiguous Cambrian Limestone Aquifer (CLA) modified from the Hydrogeological Provinces of Australia (Jacobson and Lau 1987). It covers 570,000 km 2 , straddling the Northern Territory (NT)-Queensland (Qld) border in northern Australia (Fig. 1b).
The climate in the north of the CLA is tropical grading to arid in the south. Three main townships within the CLA are Katherine, Tennant Creek and Boulia. Katherine has an annual average rainfall of 970 mm with 96% falling in the wet season between November and April, Tennant Creek has 465 mm/ year of which 90% falls in the wet season and Boulia has 260 mm/year with 76% in the wet season (BoM 2020). Although there is a regular wet season every year, there is still substantial interannual variation in the rainfall- Fig. S1 of the electronic supplementary material (ESM). Potential evapotranspiration (FAO56, Allen et al. (1998) exceeds rainfall everywhere in the study area on an annual basis but in the north the rainfall is higher during some months of the wet season.
The area has little perennial surface water but contains the headwaters of the Daly and Victoria rivers that flow east to the Timor Sea, the Roper, Limmen Bight, Robinson, McArthur, Calvert and Nicholson rivers that flow to the Gulf of Carpentaria, and the Georgina River that flows into Lake Eyre (Fig. 1a). The study area also contains the Wiso and Barkly internally draining areas (AWRC 1997). The CLA has discharge points in major spring complexes in the Daly, Roper and Nicholson catchments that provide perennial flow downstream. There are also other minor springs over the study area, some that are sourced from the CLA and others from older sediments (Fig. 1a).
The CLA comprises the Daly, Wiso and Georgina basins ( Fig. 1d) which are comprised of silicified limestone, dolomitic siltstone, fine-grained sandstone and dolomite (Randal 1967(Randal , 1973. Particularly near areas of outcrop, karstic features are evident at the surface in the form of sinkholes and caves which can act as foci for localised recharge (Yin Foo and Matthews 2001). At depth, cavities can act as rapid conduits for groundwater flow (Karp 2005). Much of the CLA is overlain by the Cretaceous aged Carpentaria Basin. Although mostly unsaturated the siltstones of the Carpentaria Basin are thought to impede recharge to the underlying CLA, particularly when their thickness exceeds 25 m (Tickell and Bruwer 2017). Much of the surface of the study area is composed of Cenozoic regolith and Quaternary sandplains and alluvium (Fig. 1c). In some areas the CLA is absent (particularly the Tennant Creek inlier between the Wiso and Georgina Basins) and the Proterozoic McArthur Basin sediments outcrop (Fig. 1c,d). A thorough review of the hydrogeology of the area can be found in .
The CLA is the primary water source for the region. The area is sparsely populated with the major centres of Katherine and Tennant Creek having populations of 6,303 and 2,991 respectively and all other towns having a population below 1,000 (ABS 2016). Most of the water use in the Georgina and Wiso Basins is for stock watering in the pastoral industry but the Daly Basin is more developed with demands on the water resource for irrigation (NRETAS 2010). The motivation for the current study is the potential for future competition for water resources if the shale gas resources of the Beetaloo Sub-basin ( Fig. 1) are developed, as the Beetaloo Sub-basin occurs at a depth of 1-4 km below the CLA (Hall et al. 2020). Shale gas development requires a water supply for drilling and fracking (Huddlestone-Holmes et al. 2020) that could potentially impact upon current users of the water (including cultural and environmental) and any future expansion of more intensive forms of agriculture.
Groundwater recharge has previously been studied across parts of the CLA; this has involved using several different methods such as CMB, water-table fluctuations, baseflow analysis and tracers (Table 1). The application here is an improvement from previous work as it covers the footprint of the CLA in its entirety using a consistent method and includes a thorough examination of the uncertainty in the recharge estimates. It should be noted that this study is estimating recharge over the footprint of the CLA, not to the CLA itself. The water table is spatially variable across the study area, depending upon the location it can sometimes be hosted in hydrogeological units either above or below the CLA. There are also parts of the study area where the CLA is absent.

Methods
The method used here can be simplified down to four steps: 1. Probabilistic estimates of recharge are made using the chloride mass balance and upscaled to a regular grid using regression kriging. This is the unconstrained CMB estimate of recharge. 2. Baseflow separation is performed on streamflow data from selected catchments to provide a lower constraint on the CMB estimates of recharge. 3. A water balance (rainfall-evapotranspiration) is conducted over selected catchments using remotely sensed evapotranspiration and gridded rainfall to estimate the excess water (recharge + runoff). This provides an upper constraint on the CMB estimates of recharge. 4. The CMB estimates of recharge are constrained using a rejection sampling procedure to reduce the uncertainty in the CMB estimates of recharge and create the constrained estimates of recharge.
The methods and results sections of this report are arranged around these four steps.

Chloride mass balance
The method applied involves estimating net recharge at a point scale using the chloride mass balance and then upscaling the point estimates to a regular grid across the study region using regression kriging. The upscaling used here is combining the multiple linear regression approach used in other areas across northern Australia (Turnadge et al. 2018;Taylor et al. 2018a, b) with the regression kriging approach that was Using CMB, average recharge was found to be 1.7 mm/year in the south, 5.0 mm/year in the centre and 7.4 mm/year in the north. The northern part was further split into east and west with 2.0 and 12.8 mm/year respectively. Using tritium analysis, the recharge was found to be 66, 140 or 190 mm/year depending on the method used in the NW of the Beetaloo region. This was the recharge rate in sinkholes not an areal estimate Entire CLA This study Used probabilistic estimates of recharge from CMB upscaled to the entire CLA on a 2,500-m grid. The uncertainty in the recharge estimates was constrained using observations of baseflow and remotely sensed evapotranspiration. This is the first study to estimate recharge over the entirety of the CLA and to thoroughly consider the uncertainty in the recharge estimates successfully used in eastern Australia ). The recharge is estimated probabilistically using 1,000 replicates of the input parameters.

Point-scale estimates of recharge
The chloride mass balance is possible because chloride in rainfall is excluded from evapotranspired water and so concentrates in the root zone. This more concentrated water then leaches downwards to the water table to become recharge. If the chloride exported through surface runoff is taken into account, Eq. (1) becomes : where RC is the runoff coefficient and is a scalar.
The chloride deposition rate was obtained from a national dataset Crosbie 2014, 2018), which mapped the chloride deposition (both wet and dry) from 300 points by fitting to a model relating the chloride deposition to the distance from the coast (Keywood et al. 1997). The mean (μ), standard deviation (σ) and skewness (g) from 1,000 replicates are used to provide a Pearson Type III distribution of chloride deposition at each point location in the study area (mean is shown in Fig. 2a). The deposition (D) for the ith location is calculated according to the Pearson Type III distribution (Pilgrim 1987): where K Yi is a frequency factor calculated from g i and a standard normal deviate (z): The standard normal deviate is generated stochastically for each replicate to generate the distribution of chloride deposition rate for each point location.
Information on the chloride concentration of groundwater is primarily held by the government agencies responsible for water management across the study area, the Northern Territory Government Department of Environment and Natural resources and the Queensland Government Department of Natural Resources, Mines and Energy. Additional data have been sourced from resources companies operating in the area and Commonwealth Government agencies. This study identified 4,296 bores within the study area that have been analysed for the chloride concentration of groundwater at least once, and of these, 3,575 were used to estimate recharge using the chloride mass balance (Fig. 1a). The bores that were excluded were due to some basic quality assurance criteria: & Bores that had a chloride concentration below 2 mg/L or above 2,000 mg/L were either unrepresentative or had their values mistranscribed and so were excluded & Bores with a drilled depth greater than 150 m were considered unlikely to be in the water-table aquifer and so were excluded & Bores located within surface-water flowlines were excluded. Flowlines are usually where the water table is shallowest and potentially subject to evapoconcentration of the groundwater.
The runoff coefficient in Eq.
(2) is as calculated by the AWRA model over the period 1910-2015 at a national scale (Vaze et al. 2013) and is extracted for the location of each of the 3,575 bores. The proportion of chloride exported in surface runoff will be less than the runoff coefficient because it is only the quickflow component that is relevant, and the quickflow is generated by the high intensity rainfall events that generally have a lower-than-average chloride concentration. The scalar has a stochastically generated value between 0.33 and 0.66 for each replicate to account for the unmeasured chloride exported in runoff.

Upscaling using regression kriging
The upscaling of the point estimates of recharge using regression kriging ((Hengl et al. 2004) includes three steps: (1) developing a regression relationship between the point estimates of recharge and the covariates to predict recharge across the study area on a regular grid, in this case 2500 m; (2) kriging the residuals between the point estimates of recharge and the regression estimates of recharge to provide a surface of residuals on the same grid; and (3) adding the residual surface to the regression surface, which provides a spatial estimate of recharge that is informed locally by point data and away from point data is dependent upon a global relationship predicted by covariates. This process is repeated for 1,000 stochastic replicates to quantify the uncertainty in the recharge.
Previous studies have shown that recharge is better approximated by a log-normal distribution rather than a normal distribution (Cook et al. 1989;Eriksson 1985), and this extends to its relationship with rainfall (Petheram et al. 2002). Recharge estimates are mainly dependent upon rainfall, soil type and vegetation (Crosbie et al. 2010;Kim and Jackson 2012;Scanlon et al. 2006) and these have been used successfully as covariates to upscale modelled point estimates of recharge  and are also used here. The rainfall used is the long-term annual average over the period 1910-2018 (Jones et al. 2009) shown in Fig. 2b. The average clay content of the top 2 m of soil has been shown to be an effective predictor of recharge (Wohling et al. 2012) and so has been used here (Fig. 2c) as a predictor using the Soil and Landscape Grid of Australia gridded clay content data (Grundy et al. 2015). The Normalised Difference Vegetation Index (NDVI) using AVHRR data (BoM 2019) is used as a measure of vegetation differences spatially as a long-term average over the period 1993-2012 and is shown in Fig. 2d.
To reduce the influence of outliers in the dataset, bootstrapping was used (Efron and Tibshirani 1994) to select the data points that would be used in the regression equation for each of the 1,000 replicates. This meant that 3,575 points were selected with replacements from the full dataset of 3,575 points; in this way each replicate is highly unlikely to include each point and will include some points multiple times and other points not at all.
It was found that the recharge was related to the rainfall (P) through a quadratic equation while a linear relationship was found to be adequate for the soils (C) and vegetation (V). This leaves the regression equation to be fitted as: where β 0 , β 1 , β 2 , β 3 , and β 4 , are fitting parameters fitted through least squares regression. The topographic slope and the depth of regolith have been suggested as important factors influencing recharge as they influence the split from rainfall between infiltration and surface runoff. This was not observed in the data set from the study area, probably due to a lack of bores drilled on steep slopes or very shallow regolith. These factors were still included in the upscaling through reducing the predicted recharge on slopes greater than 2% and for depths of regolith less than 2 m. This is the same process that has previously been used across northern Australia (Turnadge et al. 2018;Taylor et al. 2018a, b) and only affects very small parts of the study area (Fig.  2e,f).
The difference in the recharge estimated through the regression equations and through the point-scale chloride mass balance were calculated for each replicate. These residuals were fitted to a spherical semivariogram using gstat (Pebesma 2004) in R (R Core Team 2016) and then kriged to a 2,500-m grid to create a residuals surface. The residuals surface was then added to the recharge surface created from the regression equations to create 1,000 replicates of the upscaled chloride mass balance estimates of recharge. These 1,000 replicates are summarised using the 5th, 50th and 95th percentiles.

Baseflow separation
Separating the hydrograph at a stream gauge into the quickflow and baseflow components gives a simple method of investigating groundwater discharge. Baseflow is only one component of groundwater discharge and so will be less than the total groundwater discharge. The other common components of groundwater discharge are evapotranspiration from groundwater, extraction of groundwater via pumping or interaquifer flow. On a suitably long-time frame, recharge will equal discharge and therefore baseflow must be less than recharge.
The baseflow separation method used here is the digital recursive filter suggested by Lynne and Hollick (1979), although it has no physical basis, it is the most commonly applied method of baseflow separation in Australia (Grayson et al. 1996). The baseflow separation was undertaken using the Basejumper program (Murphy et al. 2008) with a filter parameter of 0.925.
Four catchments were selected that are known to have high levels of baseflow and drain large parts of the study area (Drysdale et al. 2002;Yin Foo and Matthews 2001). The selected gauges were the first gauge downstream of the known groundwater discharge zones (Fig. 3a): Flora River upstream of Stoney Creek (G8140205); Roper River at Elsey Homestead (G9030013); Gregory River at Riversleigh no.2 (912105A); and, Lawn Hill Creek at Lawn Hill no.2 (912103A). The time period chosen for analysis was the period 2001 to 2018 to be consistent with the available data for the remotely sensed actual evapotranspiration (see section 'Water balance using remotely sensed evapotranspiration').
To allow the comparison between the baseflow and recharge, the volumetric baseflow needs to be divided by the area to get an areal average baseflow. The area of a surface-water catchment is readily calculated from a digital elevation model (DEM), but in Fig. 3 a Catchments selected for baseflow separation estimates of groundwater discharge showing the topographic catchment area (SW catchment) and the assumed groundwater catchment area (shown in colour). b Internally draining catchments used to estimate recharge from a water balance with the proportion of time that standing water is recorded in Water observations From Space (WoFS) data. Inset shows location within the study area a landscape as flat as the study area, groundwater flow does not necessarily respect topographic boundaries. Previous work has created potentiometric surfaces of the study area based on the sparse observations of groundwater level and interpreted groundwater flow directions (Tickell 2003;Fulton and Knapton 2015;Tickell and Bruwer 2017;Knapton 2010); this work has been used to estimate the groundwater catchment areas for each gauge shown in Fig. 3a.
The uncertainty on the areal estimates of baseflow has not been quantified but is necessary for the constraining of the recharge estimates. The area used as the groundwater catchments may be too large leading to an underestimate of the baseflow. There are springs in the Victoria River catchment (Fig. 1) within the study area suggesting that at least some of the area assumed to be feeding the springs in the Flora River is actually flowing into the Victoria River, similarly there are springs in the Calvert, Robinson and McArthur river catchments ( Fig. 1) within the study area within the area assumed to have the groundwater flowing into the Roper catchment. On the other hand, baseflow separation tends to result in higher baseflow indices for larger catchments (Petheram et al. 2008) which may be due to attenuation of the quickflow signal; the smallest of the catchments used here is over 3,000 km 2 . A case can be made for why the estimated baseflow calculated here could be either over-or under-estimated. This study assumed that the error could be up to ±30% based on studies in other regions (Coxon et al. 2015;Petersen-Øverleir et al. 2009).

Water balance using remotely sensed evapotranspiration
A water balance using remotely sensed actual evapotranspiration (AET) can be used to estimate recharge (Szilagyi et al. 2011) but only where the runoff component of the water balance is insignificant (Crosbie et al. 2015;Swaffer et al. 2020). If the runoff component cannot be ignored, then rainfall minus AET will give an estimate of excess water (EW): runoff plus recharge.
The CMRSET algorithm for AET ) was used with MODIS data to create an 8-day time series at a 250-m resolution over the study area. These data were aggregated to a long-term average for the period 2001-2018; this was the longest period of complete years that was available from the MODIS data. The long-term average AET was subtracted from the long-term average rainfall over the same period sourced from the Bureau of Meteorology's gridded product (Jones et al. 2009) to create a long-term average excess-water data layer.
The uncertainty in the CMRSET estimates of AET has not been evaluated for the study area but is necessary for the constraining of the recharge estimates. Based on an evaluation of CMRSET water balances against runoff from stream gauges ) and reviews of remotely sensed ET (Kalma et al. 2008;Glenn et al. 2011), an error of up to ±30% has been assumed.
The long-term average excess water was extracted for the four catchments used for the baseflow separation (Fig.  3a) and for four selected internally draining catchments (BoM 2012; Fig. 3b). These four catchments can be split into two groups: the northern catchments have a normal dendritic drainage pattern that has been captured by a sinkhole (or sinkholes); and, the southern catchments have a centripetal drainage pattern and terminate in a topographic depression. The difference between these types of catchments is immediately obvious in the Water observations From Space (WoFS) data (Mueller et al. 2016), the centripetal catchments have standing water for a significant amount of time at their lowest point, whereas the captured dendritic catchments do not (Fig. 3b).

Constraining the CMB recharge
The CMB estimates of net recharge are constrained by the baseflow and the excess water estimates using a rejection sampling approach (Tarantola 2005;Von Neumann 1951). This relies on having a distribution of each of these three quantities.
The baseflow (BF) has an estimated uncertainty of ±30%. Assuming that the uncertainty is normally distributed and using the "six-sigma rule" gives the standard deviation of the baseflow prediction as 10% of the estimated value. This can then be used to estimate a probability distribution which can be randomly sampled from: where z is the randomly selected standard normal deviate and μ i and σ i are the mean and standard deviation of the baseflow for the ith catchment. In this way the same standard normal deviate is used to estimate the baseflow for each of the four catchments for each repetition of the rejection sampling algorithm. Similar to the baseflow, the actual evapotranspiration also has an estimated uncertainty of ±30%. Making the same assumptions as the baseflow the probability distribution of the excess water is: where EW i is the excess water for catchment i. Again, a single standard normal deviate is used to estimate the excess water for each of the eight catchments for each repetition of the rejection sampling algorithm. The average recharge across the eight catchments has been extracted from the 1,000 upscaled replicates of the chloride mass balance. The 1,000 replicates are assumed to represent the entire population and so a probability distribution has not been created. This will underestimate the extreme tails of the distribution, but these are not important as they will be eliminated by the rejection sampling procedure. The calculation of the 5th and 95th percentiles are stable after 1,000 repetitions meaning that this assumption has no bearing on the final constrained recharge estimates.
The rejection sampling algorithm was run 10,000 times with a randomly selected standard normal deviate for the baseflow distribution, a second randomly selected standard normal deviate for the excess water distribution and a randomly selected run number from the 1,000 replicates of upscaled chloride mass balance recharge. A selection was retained in the posterior distribution if the following constraints were met: 1. The upscaled chloride mass balance estimates of recharge were greater than the baseflow estimates for each of the four catchments 2. The upscaled chloride mass balance estimates of recharge were less than the excess water estimates in each of the eight catchments

Point-scale estimates of recharge
There is an adequate spread of bores geographically that covers the rainfall gradient and the range of soils and vegetation (Fig. 2), but it is quite sparse with less than one bore per 100 km 2 and particularly sparse in the Wiso Basin (Fig. 1d). The runoff coefficient is below 0.03 for 50% of the bores and below 0.1 for 80% of the bores (not shown). It is only the high rainfall areas in the north of the study area where the runoff coefficient is above 0.1 and the chloride exported through runoff can be up to 10% of the total chloride deposition. For the majority of the study area the chloride exported through runoff is negligible (but still accounted for in the calculations).
To assess the covariates, the recharge was estimated deterministically from the mean chloride deposition and the chloride concentration of the groundwater (Eq. 1). Figure 4a shows that the recharge has a positive correlation with rainfall that is not a simple linear relationship, a quadratic relationship is used here. The clay content of the top 2 m of the soil has a slight negative correlation with the log recharge (Fig. 4b). The weak relationship with clay content is not surprising as rainfall is the dominant predictor with such a wide range in values. The relationship with NDVI appears to be going the wrong way (Fig. 4c); more dense vegetation should produce lower recharge. However, the NDVI is correlated with rainfall, causing the positive correlation with log recharge without taking the rainfall into account.
When these covariates are used in multiple linear regression to predict the deterministic estimates of recharge using Eq. (4), they are all highly statistically significant (p < 0.001) predictors using a t-test (Table 2). The coefficient for rainfall is positive indicating that increasing rainfall will increase recharge but the coefficient for rainfall squared is negative which acts to flatten the exponential rise in the curve for high rainfall which could otherwise extrapolate into infeasible space (i.e. R > P). The coefficient for clay content is negative  indicating that increasing clay content of the soil decreases recharge. The coefficient for NDVI is negative (in contrast to Fig. 4c) which indicates that increasing vegetation density results in a reduction of recharge. The overall p value for the regression equation is 2.20E-16 with an r 2 of 0.66. The regression equation can only explain 66% of the variance in the point recharge estimates which means there is 34% of the variance that cannot be explained by these covariates. If this variance has some spatial cohesion then regression kriging will reduce some of this variance and outperform upscaling using regression alone.

Upscaling using regression kriging
The coefficients fitted to Eq. (4) for each of the 1,000 replicates are shown in Fig. 5a-e in the form of boxplots. Consistent with the deterministic result, the coefficient for rainfall is positive and the coefficient for rainfall squared is negative (other than some outliers). The clay and NDVI coefficients are both negative in agreement with the deterministic result. The r 2 results are closely grouped around 0.59 (Fig. 5f); this is less than the deterministic result because of the added stochastic noise due to the treatment of chloride exported in Fig. 5 Coefficients used in the regression equations for upscaling the 1,000 replicates (a-e), f the r 2 for each of the 1,000 replicates, and g the relative importance of the four covariates for each of the 1,000 replicates. The line in the centre of the box is the median, the box represents the interquartile range (the 25th-75th percentiles), the whiskers represent the 10th and 90th percentiles and the dots are any data points outside of the 10th and 90th percentiles Fig. 6 a The 50th percentile of the upscaled recharge using regression only and b the 50th percentile of the kriged residuals surface runoff and the bootstrapping to reduce the influence of outliers. The relative importance (Groemping 2006) of each factor is the proportion of the variance that each one explains (Fig. 5g), and it can be seen that the rainfall, rainfall squared and NDVI have more importance to the regression than the clay content of the soil which contributes less than 5% of the variance explained by the regression equation.
The upscaling based on the regression equation only reflects the covariates used and their importance. Figure 6a shows that the dominance of the rainfall gradient on the distribution of recharge with the highest recharge in the north-west and the lowest in the south-east. There is some influence of the spatial patterns from the soil and vegetation layers (Fig. 2).
The semivariogram fitted to the residuals has three parameters: the nugget, sill, and range. The median nugget of 0.05 means that there is still some variance that is not accounted by the regression or the spatial dependence of the residuals; this could be due to subgrid scale variation or measurement errors. The median sill of 0.19 compared to the nugget shows that there is significant spatial dependence of the residuals over a large distance shown by the median range of 95 km.
The kriged residuals show landscape-scale correlation in the recharge estimates that is not being modelled by the regression equation. The spatial distribution of the kriged residuals (Fig. 6b) appears to be random but does have some structure related to information that the regression equation does not have. Areas of positive residuals are indicative of recharge being higher than predicted using the regression equation and areas of negative residuals have lower recharge than that predicted. Some areas of preferential recharge appear to be coincident with the internally draining catchments (Fig. 3b), terminal lakes (Fig. 3b) and outcropping Cambrian Limestone (Fig. 1). As the residuals surface has a spatial average close to 0 (least squares regression has a mean residual of 0), the areas of preferential recharge are balanced by everywhere else with a negative residual.
Adding the residual surface to the regression recharge surface results in the regression kriging surface of upscaled chloride mass balance recharge; the 1,000 replicates are summarised as the 5th, 50th and 95th percentiles (Fig. 7). This shows the same dominant trend as the regression surface shown in Fig. 6a. The recharge generally follows the rainfall gradient with the highest recharge in the north-west and the Fig. 7 The 5th, 50th and 95th percentiles of recharge estimated using the chloride mass balance upscaled using regression kriging. The point data is the 50th percentile of the point estimates of recharge using the chloride mass balance lowest in the south-east. In contrast to the regression surface, the upscaled recharge using regression kriging also shows higher recharge in areas of preferential recharge associated with the sinkholes, terminal lakes and limestone outcrops. The spatial average recharge for the 50th percentile is 12 mm/year with the uncertainty represented by the 5th and 95th percentiles at 5 and 36 mm/year respectively.
The upscaled CMB recharge using regression kriging at a grid cell scale is a good match for the recharge calculated at a point scale for the 50th percentile (Figs. 7a and 8a). The r 2 for the 1,000 replicates is shown in Fig. 8b with a median of 0.79. This is a substantial increase from the r 2 of 0.59 that was found for the upscaling based on regression only.

Baseflow separation
For the four catchments selected (Fig. 3), the surface-water (SW) catchment area was calculated from the topography (Table 3), but the current conceptual model suggests that the groundwater (GW) area contributing to the baseflow is much larger for three of the four catchments. Lawn Hill Creek has the lowest volumetric baseflow, partly due to having by far the smallest catchment area but has the highest areal average baseflow. The neighbouring Gregory River has much higher volumetric baseflow but lower areal average. The Roper River catchment is approximately double the size of the Flora River catchment and this is reflected in the volumetric baseflow, but they have similar areal average baseflow.

Water balance using remotely sensed evapotranspiration
The excess water has been calculated on a 250-m grid across the study area and aggregated to an annual average for the period 2001-2018 (Fig. 9a). Areas in green have rainfall greater than actual evapotranspiration and are exporting excess water either through runoff or recharge. Areas shown in purple are where actual evapotranspiration is greater than rainfall and these areas are receiving excess water either through surface run-on, groundwater discharge or applied water through irrigation.
For the four selected internally draining catchments, the Lake Woods catchment stands out due to the area of Lake Woods itself having an excess water of over −1,000 mm/year (Fig. 9b); this is consistent with it being inundated for a substantial proportion of the time (Fig. 3b). The other three catchments do not have similar large areas of negative excess water. At the whole of catchment scale, all four catchments have a positive excess water, indicating that they are recharge features in the landscape (Table 4). Similarly, all four catchments used for the base flow analysis also have a positive excess water showing that they are exporting water as some combination of recharge and runoff.
The area around Mataranka is a groundwater discharge area for the Roper catchment (Fig. 9c). The baseflow component of the groundwater discharge is easily measured by the stream gauge downstream of the springs but the diffuse discharge via evapotranspiration is not so easily measured. Almost the entire portion of Elsey National Park south of the Roper River has a negative excess water (up to −500 mm/year), indicating that  evapotranspiration is greater than rainfall. This includes the runoff component so the groundwater discharge will be greater than indicated by the excess water.

Constraining the CMB estimates of recharge
The upscaled CMB estimates of recharge were constrained using rejection sampling by having to be greater than the baseflow for all four catchments and less than the excess water in those same four catchments and an additional four internally draining catchments. Of the 10,000 samples tested, 1,389 were retained in the posterior distribution. For each of the eight catchments sampled, the uncertainty of the net recharge was constrained. The 5th percentile was greater in the constrained versus unconstrained distribution and the 95th percentile was reduced (Figs. S2 and S3 of the ESM). Consequently, the interquartile range was reduced. The median of the net recharge increased in the constrained case. The baseflow has been constrained by a small amount with a small reduction in the magnitude of the distribution. In contrast the Fig. 9 a Excess water across the Cambrian Limestone Aquifer; b excess water over the four internally draining catchments used for a water balance; c groundwater discharge in Elsey National Park, both diffuse discharge through ET and localised discharge through springs   Fig. 10 The 5th, 50th and 95th percentile of recharge estimated using the upscaled chloride mass balance constrained by the baseflow and remotely sensed ET measurements constraining process has had a big effect on the excess water with a substantial reduction in the range of values, and the negative values have been eliminated from the posterior distribution. The spatial average recharge for the 50th percentile is 16 mm/year with the uncertainty represented by the 5th and 95th percentiles at 9 and 31 mm/year, respectively (Fig. 10). The range between the 5th and 95th percentiles has decreased from 31 mm/year for the unconstrained case to 22 mm/year for the constrained case, demonstrating the value in constraining the upscaled CMB estimates of recharge with the baseflow and excess water observations. When further broken down to regions, the recharge (Fig. 10) is following the rainfall (Fig. 2b), and similar reductions in the range between the 5th and 95th percentiles are seen for the constrained case versus the unconstrained case (Table 5). For the geological basins the Daly Basin has the highest recharge at 82 (49-156) mm/year, the Wiso has 17 (9-31) mm/year and the Georgina has the lowest at 9 (5-18) mm/year (Table 5). For the surface-water catchments those in the north have the highest recharge (Daly, Roper and Victoria rivers) and those in the south have the least (Georgina River and Barkly region).

Assessment of the assumptions made in the methodology
One of the objectives of this study was to thoroughly assess the assumptions made in the methodology to determine their impact on the uncertainty of the recharge estimates. Being transparent about the assumptions made, and the limitations of the analysis, provides confidence that the recharge estimates are robust (Peeters 2017). Wood (1999) listed the assumptions made in estimating recharge using CMB as: (1) chloride in groundwater originates from rainfall on the aquifer and not from flow from underlying or overlying aquifers; (2) chloride is conservative in the system; (3) steady-state conditions are assumed in that the fluxes of chloride and water have not changed over time; and (4) there is no recycling of chloride within the aquifer. Assumptions 1, 2 and 4 can generally be met but assumption 3 is problematic in regional-scale applications of the CMB.

Point estimates of recharge
The chloride deposition used here from Davies and Crosbie et al. (2018) includes the uncertainty in the measurements and the upscaling but is reliant on the measurements being representative of the average chloride deposition over potentially thousands of years. The chloride deposition measurements used by Davies and Crosbie et al. (2018) were bulk rainfall samples that include both the wet deposition in rainfall and the dry deposition of dust fallout. The study area has a reasonable density of measurements for the chloride deposition, but repeated sampling has shown the uncertainty in assuming a long-term average from 1 to 5 years of data. The chloride deposition at Katherine has been measured in five studies over the past 60 years with a range from 2.46 to 7.30 kg/ha/ year (Wetselaar and Hutton 1963;Galloway et al. 1982;Likens et al. 1987;Keywood et al. 1997;Wilson et al. 2006). The dry deposition of chloride can be problematic for the CMB if there is a local source of salt in the landscape, e.g. recirculation of salt from salt lakes. The internally draining catchments in the study area result in terminal lakes that are recharge features of the landscape; they are not, however, the salt lakes that are common across southern Australia that result in local recirculation of chloride. If the chloride samples of the groundwater were sampled from just below the water table at the top of the water column, then this would be recently recharged water and are valid for use in the CMB with recent estimates of the chloride deposition due to rainfall. If the groundwater samples were well mixed from the entire water column of the unconfined aquifer, then the chloride concentration is an average over the residence time of the water in the aquifer. In high-recharge, short-flowpath areas in the north of the CLA, the residence time may be on the scale of decades to centuries and the assumption of the chloride deposition being in steady state is reasonable. For the low-recharge, long-flow-path arid areas in the south of the CLA, the residence time of the water may be thousands or tens of thousands of years, and the assumption that measurements of chloride deposition over the last 60 years are applicable on this time scale is questionable. The assumption of steady-state chloride deposition is a source of unquantified uncertainty in the recharge estimates made here.
Every recharge review has demonstrated that recharge increases with rainfall, is greater in lighter textured soils than clays and has more recharge with sparse vegetation (Scanlon et al. 2006;Crosbie et al. 2010;Kim and Jackson 2012). The regression equation coefficients when fitted to Eq.(5) agree with the previous work collated by these reviews. The clay and NDVI coefficients are negative throughout their range (Fig. 5d,e) indicating that for a given rainfall, recharge decreases with increasing clay content or NDVI. The quadratic relationship fitted to the rainfall is successful in not letting the relationship extrapolate into an infeasible range (R > P) but not being a monotonic relationship can lead to problems. For the deterministic example shown (Fig. 4), the relationship has a minimum at 175 mm/year of rainfall with 0.59 mm/year of recharge, and the lowest rainfall point is 154 mm/year with a slightly higher 0.63 mm/year of recharge. The reason for this contradiction is the position of the inflection point in the fitted quadratic equation. This difference is negligible in the context of estimating recharge across the entire CLA but it would be more conceptually correct to use a monotonically increasing function to avoid this problem.
The regression equations were able to explain~59% of the variance in the point recharge estimates using three covariates. If more variables were built into the regression model the amount of variance explained may be able to be increased. Fu et al. (2019) evaluated 88 covariates before finding an optimum 10-parameter regression model to estimate recharge. Both Barron et al. (2012) and Fu et al. (2019) found that annual average rainfall may not be the best predictor of recharge; seasonal rainfall, rainfall intensity and number of wet days were all found to be important. Similarly, more nuanced correlations with vegetation may be able to be attained with more ecohydrologically relevant parameters than NDVI. Some examples could include: proportion of persistent and recurrent vegetation as a surrogate for perenniality; vegetation height as a surrogate of rooting depth; and, proportion of bare ground. The clay content of the soils was the least influential of the covariates used in the upscaling (Fig. 5g). More hydrologically relevant soil parameters could be included such as available water capacity and hydraulic conductivity. However, as these would probably be derived from pedotransfer functions, the percentage of sand, silt and clay may see an improvement over just the percentage of clay used here.
Adding more parameters into the regression equation may increase the variance explained by the model for each replicate but would do little to decrease the uncertainty in the recharge estimates overall. The difference between the 10th and 90th percentile of the coefficient for the intercept in the regression equation (Fig. 5a) is 0.7, this is 70% of an order of magnitude or a factor of 5. This coefficient is directly related to the standard normal deviate used to generate the chloride deposition (Eqs. 3 and 4). This uncertainty in the chloride deposition transfers linearly to the uncertainty in the recharge estimates (Eq. 1), the best way to decrease the uncertainty in the recharge estimates from the chloride mass balance would be to have a long-term monitoring program for rainfall chemistry like in other parts of the world (Lamb and Bowersox 2000).

Upscaling process
The upscaling process using regression kriging allows the point recharge estimates to be upscaled to a regular grid allowing recharge estimates to be made across the entirety of the CLA. However, the uncertainty in the recharge estimates increases with increasing distance from the point data sources. This is particularly evident in the difference between the 5th and 95th percentiles of constrained recharge in the data sparse region of the southern Wiso Basin (Fig. 10).
The kriging of the residuals (Fig. 6b) showed some spatial structure indicating information that is present in the pointscale data that is not captured in the covariates used in the regression equations. There are areas of positive residuals associated with the internally draining catchments to the north of the Beetaloo Sub-basin (Fig. 3b) and the terminal lakes such as Lake Woods and Tarrabool Lake (Fig. 1b). Other areas with positive residuals include areas where the surface geology is of Cambrian age (Fig. 1b).
Yin Foo and Matthews (2001) reported that landholders had observed sinkholes capturing overland flow during runoff events to several metres deep, but many only hold water for a few hours. They considered point recharge through sinkholes to be the major source of recharge on the Sturt Plateau (Yin Foo and Matthews 2001). Geological mapping in the 1960s and 1970s identified some sinkholes but it is not known how extensively they were mapped, Evans et al. (2020) collated the known areas of extensive sinkhole development around the Beetaloo Sub-basin (Fig. 3b), but there may be others. The major regional-scale karstic features are well known (e.g. Kutta Kutta caves in the Daly catchment or Camooweal Caves in the Georgina catchment) but the local submetre-scale karstic features may be important for recharge but not mapped (Jolly 2009). The two catchments used here where streamflow has been captured by sinkholes are extreme examples on a regional scale rather than the small-scale features that may occur more frequently. Catchment areas from a few hectares to a few square kilometres are responsible for cave formation in the Georgina Basin via preferential recharge paths (Eberhard 2003). The areas with positive residuals coinciding with the Cambrian age surface geology is probably an indication of preferential recharge through karstic features, although this is not exclusive to the outcropping limestone as sinkholes have developed though overlying Cretaceous sediments (Randal 1973). The areas shown in Fig. 6b as having a positive residual and therefore enhanced recharge are interpolated from the point-scale bore data without any information related to karstic features, while the recharge aggregated to catchment scale is probably reasonable as it will not be correct at the point scale of each sinkhole.
The major terminal lakes such as Lake Woods, Tarrabool Lake and Sylvester Lake ( Fig. 1) have been previously identified as recharge features in the landscape (Verma and Jolly 1992;Yin Foo and Matthews 2001), but it is only Lake Woods that has been studied in detail (de Caritat et al. 2019). The recharge rate through the lake beds has not been quantified previously and the values shown here (Fig. 10) are spatially attenuated as the upscaling process does not know where the lakes are. The positive residuals around the lakes (especially Lake Woods and Tarrabool Lake, Fig. 6b) have been interpolated from point data at the bores and so do not match the outline of the lakes themselves. The aggregated recharge calculated here at the catchment scale is probably reasonable but will be underestimated at the pixel scale within the lakebed.

Constraining the CMB estimates of recharge
Constraining of the recharge using the baseflow and the remotely sensed evapotranspiration resulted in a 29% reduction in the range between the 5th and 95th percentiles (from 31 to 22 mm/year). From Fig. S2 of the ESM, it can be seen that the lower quartile of the recharge has been increased and the upper quartile of the baseflow has been reduced. The rejection sampling method has jointly constrained both the recharge and the baseflow. Similarly, the upper quartile of the recharge has been reduced (in most cases) and the lower quartile of the excess water has increased (Figs. S2 and S3 of the ESM). In jointly constraining both the recharge and excess water, all the negative values of excess water have been excluded as all eight catchments are exporting water either through the groundwater or surface water.
Of the individual tests used in the rejection sampling, the baseflow in the Flora River provided no constraint as all 10,000 samples passed. The other catchments ranged from 5,402 samples passed in the Roper River to 6,386 passed in the Gregory River. This could be an indication that the area assumed for the Flora River's groundwater catchment is too large or that the baseflow is too small for the area it is applied to. The springs in the Victoria River catchment (Fig. 1a) would suggest that at least part of the groundwater flow assumed to discharge to the Flora River is actually discharging to the Victoria River, thus the area is too big (Fig. 3). It is also conceivable that some of the groundwater flow could flow under the Flora River and discharge to springs further down the Daly River (Fig. 1a).

Comparison to previous recharge estimates
Previous recharge estimates are quite similar to those produced here (Table 1 and cf. Table 5) and generally follow the rainfall gradient with higher recharge in the north and lower in the south. The previous recharge estimates have used a variety of techniques in a variety of different ways. No attempt has been made to assess the validity of the previous recharge estimates.
In the Daly catchment, Jolly (1984) found an average recharge rate of 100 mm/year and later 90 mm/year (Jolly 2002) which compares well to the 101 (61-192) mm/year found here. In areas with Cretaceous cover the recharge could be half of this catchment average (40-50 mm/year) and up to double in areas without Cretaceous cover (100-190 mm/year;Jolly 1984Jolly , 2002. At a grid cell scale for the 50th percentile, the range found here was <1 to >300 mm/year which is comparable to the range identified in local scale studies (Wilson et al. 2006).
In the Roper catchment, Bruwer and Tickell (2015) found recharge to be 44 mm/year near Mataranka which is similar to the catchment average found here of 49 (27-93) mm/year. The recharge found by Crosbie et al. (2009) is considerably higher as it encompasses the entire Roper catchment which includes higher-rainfall areas outside of the CLA.
Yin Foo and Matthews (2001) found that recharge was in the range of 6-18 mm/year in the lower-recharge areas of the Sturt Plateau, similar to the 3-13 mm/year found by Jolly et al. (2004) in the northern Wiso Basin and the 2-8 mm/year found by Deslandes et al. (2019) for the same area. These studies are in similar areas to the Beetaloo Sub-basin East used here, that was found to have an average recharge of 12 (7-23) mm/year. The western part of the Beetaloo Sub-basin was found to have a higher average recharge of 48 (27-92) mm/year in this study which is a similar trend to the higher point recharge estimates of 5-70 mm/year (Jolly et al. 2004) and66-190 (Deslandes et al. 2019) found previously.
In the Barkly region, Tickell and Bruwer (2017) assumed that recharge was negligible where there was Cretaceous cover and up to 12 mm/year where the Gum Ridge Fm outcrops. The recharge estimated here is also higher in the three areas that Tickell and Bruwer (2017) identified as recharge areas but the recharge under the black soil plains (high clay content on Fig. 2c) was found to be in the range 1-3 mm/year rather than being negligible. Overall, the recharge found here for the Barkly region was 9 (5-17) mm/year.
In the north of the Georgina catchment, Read (2003) estimated the recharge as 2-6 mm/year which is not dissimilar to the 6 (4-12) mm/year estimated in this study for the entire catchment. There have not been previous estimates of recharge in the southern half of the Wiso Basin, the southwest of the Georgina Basin in NT or the Qld portion of the Georgina Basin. The previous recharge estimates in the Limmen Bight, McArthur, Robinson, Calvert and Nicholson River catchments have been made at the whole of catchment scale (Crosbie et al. 2009) and so are higher than those produced here, as the rainfall increases toward the coast outside of the CLA; they are not directly comparable to the recharge estimates made here.

Conclusions
The uncertainty in the recharge estimates using the CMB are largely due to the uncertainty in the chloride deposition, and it is difficult to measure this over appropriate time scales with the spatial resolution needed. While better input data would be nice to have, this report presented an alternate way of reducing the uncertainty in the recharge estimates from the chloride mass balance by developing a method to constrain the uncertain recharge estimates using complementary data sources that are much easier to obtain. Streamflow is routinely monitored in many places and baseflow can be extracted from the streamflow hydrograph by many different methods. Similarly, remotely sensed evapotranspiration is available globally at a variety of spatial and temporal resolutions that are appropriate for use in estimating recharge. The rejection sampling method was successful in reducing the uncertainty in the recharge estimates by nearly one third for the application used here across northern Australia (with the uncertainty quantified as the difference between the 5th and 95th percentiles).
The method was successfully applied to the entirety of the Cambrian Limestone aquifer across northern Australia. The average recharge across the 570,000 km 2 was found to be 16 mm/year for the median case with a 5th percentile of 9 mm/year and a 95th percentile of 31 mm/year. The recharge largely followed the rainfall gradient with the highest recharge in the Daly catchment in the north-west with a catchment average of 101  mm/year and the lowest in the Georgina catchment in the south-east with 6 (4-12) mm/year. The spatial distribution had considerable heterogeneity, related to soils, vegetation, surface geology and also focused sources of recharge related to terminal lakes and karstic features.
The recharge estimates produced here are appropriate for regional-scale water resources assessment but do not identify local point sources of recharge that require protection from contamination. If this region is to be further developed (for instance for the gas industry or more intensive agricultural activities), then consideration should be given to identifying areas such as sinkholes that act as preferential recharge sources, and determining appropriate strategies to manage potential contamination sources at the surface. The input datasets, code and outputs associated with this report are available from (Geological and Bioregional Assessment Program 2020).