A method for reconstructing past lake water phosphorus concentrations using sediment geochemical records

An existing steady state model of lake phosphorus (P) budgets has been adapted to allow reconstruction of long-term average historic lake water total phosphorus (TP) concentrations using lake sediment records of P burial. This model can be applied without site-specific parameterisation, thus potentially having universal application. In principle, it is applicable at any site where there is both a sediment P burial record and knowledge of the current water budget, although we advise caution applying it to problematic sediment records. Tested at six published case study sites, modelled lake water TP concentrations agree well with water-quality monitoring data, and limited testing finds good agreement with wholly independent diatom inferred lake water TP. Our findings, together with a review of the literature, suggest that well preserved lake sediments can usefully record a long-term average P burial rate from which the long-term mean lake water TP can be reliably estimated. These lake water TP reconstructions can provide meaningful site-specific reference values to support decision making in lake eutrophication management, including establishing targets for lake restoration.


Introduction
Excess phosphorus (P) loading and the resulting eutrophication of waterbodies is a problem that affects aquatic ecosystems globally. Target-driven management of these sensitive lake ecosystems typically aims to recover the system to an undisturbed state, which requires some quantification of pre-disturbance nutrient levels. Most monitoring records of lake water quality do not cover pre-disturbance periods and therefore cannot provide this nutrient baseline. Historic lake water P levels must therefore be reconstructed, enabling the timing and magnitude of recent anthropogenic impacts on lake water quality to be quantified in relation to longer term changes.
Lake sediment records provide a unique archive of long-term change, including recent disturbances and natural baselines (Battarbee 1999). They can be used to determine site-specific targets for nutrient concentrations or evaluate the achievability of existing targets (Cardoso et al. 2007;Bennion et al. 2011;Sayer et al. 2012). Current approaches use palaeoecological records as indicators of change, where transfer functions can turn a microfossil assemblage into a record of past water quality, with diatom records the most common way of reconstructing lake water total phosphorus (TP) (Cumming et al. 2015). This palaeoecological approach requires a time and resource-intensive tailored training set, and has the disadvantage that microfossil records do not preserve in all lakes. Furthermore, the validity of the transfer function approach has been questioned on theoretical grounds (Juggins 2013).
Mass balance methods based on sediment geochemical P records offer an alternative approach that is wholly independent of the palaeoecological methods, and potentially less costly. Moss (1980) applied this concept to estimate past lake water TP with apparent success at Barton Broad (Norfolk, UK), and three subsequent studies further explore application of mass balance models to sediment P data (Dillon and Evans 1993;Rippey and Anderson 1996;Jordan et al. 2001) though they do not actually infer lake water TP. Despite these successes, the approach has not been fully developed and no subsequent attempts to apply it have been published. Several issues may have discouraged further work. Doubts over sediment P record reliability persist, even though there are many examples of successful use of P records in palaeolimnological analysis (Engstrom and Wright 1984). Furthermore, the observation that surface sediment concentrations of P are often uncorrelated with TP concentrations in the overlying lake water has encouraged a belief that that sediment P profiles do not record reliable information about past lake water TP (Engstrom and Wright 1984;Ginn et al. 2012), and by implication this doubt appears to have been transferred to studies of P burial flux. In addition to these issues, it is also possible that model formulation issues and lack of wider model validation may have discouraged replication and wider application.
Here we show that lake water TP concentration should correlate not with sediment P concentration, but with P burial flux, provided long-term averages are considered. We present a geochemical method for reconstructing long-term average water TP concentrations based on lake sediment P burial fluxes, an approach which in principle is universally applicable at sites with well-preserved sediment records. The conditions required for meaningful application of the model are described in the discussion section of the paper. Our method, building on the earlier attempt of Moss (1980), takes a simple mass balance approach used widely in limnology, but reframes it from the palaeolimnological perspective. We show how a sediment P mass balance approach can be used to reconstruct lake water TP at six case study lakes of varying hydrological and morphometric character, in each case comparing lake water TP inferred from published sediment P records and with monitored data.

Modelling context
Nutrient loading and simple mass balance models The first simple mass balance model concerning total P fluxes in lakes was developed by Vollenweider (1968Vollenweider ( , 1975 after an extensive study into the causes of eutrophication. This model expresses the P budget dynamics of a water body as a balance between competing supplies and losses of P: dP dt ¼ P supply À P lost to sediment À P lost to outflow where, P supply refers to all externally derived P via inflow and atmospheric deposition, plus P supplied from temporary stores within the lake; P lost to sediment refers to all P transported to the sediment via particle settling and permanently retained by it; and P lost to outflow refers to all P exported in outflowing water, including seepage. At steady state, whereby internal P stores (such as water column and biomass) can be neglected, this equation simplifies to the form in which the Vollenweider model is best known: P supply ¼ P lost to sediment þ P lost to outflow ð2Þ The Vollenweider steady-state model, and subsequent developments of it, focus on water chemistry, using hydraulic and morphometric information on individual lakes to predict lake water TP. These models rely on the assumption that the lake is well mixed and at steady state. The early development of these models is summarised by Rast and Thornton (2005) and several studies have reviewed their application (Nürnberg 1984;Ahlgren et al. 1988;Brett and Benjamin 2008). We show below that because the Vollenweider model considers all P in the system, it provides a framework for modelling the total concentration of P in lake sediments.
A note on units and notation The fluxes and pathways referred to in this study can be measured, calculated and represented in different ways, and have been used interchangeably in previous studies with no overarching standardisation of either notation or units. In this study we have largely adopted the scheme presented by Brett and Benjamin (2008) with any new notation conforming to their approach. We also retain some of the more established hydrological notation conventions. Table 1 shows all the parameters used in this study, with their units and definitions. For clarity we separate sediment and lake water phosphorus; using TP lake (mass per unit volume) for any measure of lake water concentration, and P (mass per mass) for sediment total phosphorus concentration. Most importantly we express all fluxes as lake loading (L), where loading refers to P flux (mass per unit time) normalised to lake area, giving units of mass per unit area per unit time (generally, mg P m -2 yr -1 ). This intensive unit has the advantage over extensive units for flux of being scale independent, and thus more easily comparable between lakes.
The steady state Vollenweider model applies to values that are averaged over time periods long enough to smooth out variations arising from the dynamics of internal P stores, an averaging period that is not precisely defined. We refer to such averaged values simply as long term, where we take long term to mean [ [ annual (i.e. at least multi-annual to decadal scales). Conversely, we use short term to refer to instantaneous value and averages over shorter time periods.
A number of algebraically equivalent terms have been employed in previous formulations of the Vollenweider model. To help avoid confusion two common combinations of variables are compared here: Areal sediment mass accumulation rate, i.e., MAR flux /A C MAR core g m -2 yr -1 MAR as measured for a specific core MAR flux g yr -1 Sediment mass flux to whole lake L in mg m -2 yr -1 Areal P loading to the lake system (inc. water inflows and atmospheric deposition) L out mg m -2 yr -1 Areal P loading from the lake system (all P exported from lake via outflow and seepage) L sed mg m -2 yr -1 Areal P loading to lake bed (all P exported from lake via outflow and seepage) 1. The apparent settling velocity, v (m yr -1 ), is equivalent to the first order sedimentation coefficient, r (yr -1 ), if mean lake depth, z (m), is allowed for. Thus: 2. The areal water loading, q s (m yr -1 ), is equivalent to the water flushing rate, q (yr -1 ), if lake mean depth, z (m), is allowed for. Thus Consequently, the expressions in Eq. 5 are fully equivalent.
In this study we use v and q s instead of z, r, and q as it has been shown that areal forms of these coefficients are better predictors of lake condition than the volumetric forms (Kirchner and Dillon 1975). Note that areal water loading, q s , can be calculated in different ways depending upon what measured data are available. See Table 2 in the methods section for procedures.
In keeping with previous analysis (Vollenweider 1975;Brett and Benjamin 2008), we assume: • The lake is well-mixed such that lake water TP (TP lake ) has the same value as the outflow TP concentration (TP out ). • The areal water loading (q s ) is the same for the inflow and outflow. This effectively assumes that rainfall receipt and evaporative loss is the same for the lake body as for the catchment.
The equations we present in this study are valid if these conditions are met. The well-mixed lake assumption is uncontroversial, but neglecting enhanced evaporative loss from the lake surface will not always be appropriate. The equations can be modified to take this into account. We also generally assume that the area of lake bed that accumulates fine sediment (A acc ) is equal to the whole lake area (A L ). Again, the equations can be modified where this is not the case.

Phosphorus retention
Simple mass balance models can be expressed in terms of a phosphorus retention coefficient (R P ), which is defined as the proportion of externally-derived P that is retained by the lake (i.e. lost to the sediment) and has not left the lake in outflowing water (including seepage to groundwater). For a lake at steady state (Eq. 2), R P is given by: In simple mass balance models R P is used to estimate inflow supply and/or loss from the hydrological system. R P can either be determined from longterm (multi annual) nutrient budgets or predicted from lake hydrological and morphometric characteristics. The first approach requires sufficient monitoring data to get representative values. In many cases, data sets are insufficient or unavailable, in which case the predictive methods must be used.
Estimating R P using inflow/outflow budgets Lake P retention can be calculated by direct measurement of inflow and outflow P loading where: The P Loading values are calculated from the measured water TP concentration and the water loading, exemplified for L out : Provided the data are reliable, direct measurement of present-day conditions will produce values that best capture the nutrient budget of the site. Long periods of monitoring are required to obtain representative data (Dillon and Evans 1993) as short term or infrequent measurements may capture exceptional events or include seasonal bias. Some sites will be difficult to quantify fully as not all inputs and outputs will be directly or easily measurable; site access may be problematic, and conducting reliable monitoring is time and resource intensive.
Estimating R P using hydrological models Lake P retention (R P ) can also be determined by using a number of methods calibrated using data sets of hydrologically measured R P , where R P is modelled using hydraulic and morphometric information on the individual lakes. The first of these approaches, developed by Vollenweider (1975), is based on the steady state model. The Vollenweider continuity equation (Eq. 2) can be expressed as: where v is the apparent P settling velocity. Here, TP lake Â v represents P loading to the sediment, while TP lake Â q s represents the outflow P loading. Combining Eq. 7 with Eq. 9, R P is given by: The value for v can be found by optimisation using a data set of measured water mass balances (Chapra and Dolan 2012). Using this approach, Vollenweider (1975) selected a value of v = 10 based on a study of predominantly Swiss lakes. There appears to be no ideal universal v value; Vollenweider (1975) observed that v = 10 did not transfer well to the Laurentian Great Lakes and subsequent studies have suggested  (Kopáček et al. 2004) L sed (Kopáček et al. 2004) Further details are given site by site in the Methods section *Environment Agency water quality data from the Water Quality Archive (Beta) other optimum values for v based on analyses of different datasets. The second approach is that of Kirchner and Dillon (1975), an empirical model based on q s . They found the relationship between measured R P and q s values from 15 North American lakes could be described by a double exponential fit: The final method is based on water residence time (s) (Larsen and Mercier 1976) (or its inverse, q the flushing rate) giving: These methods are well established in limnology and the concepts widely used in lake modelling. The particular benefit of the predictive modelling approach is in providing values estimated from large independent data sets. Furthermore, the parameters needed are easily obtainable.

Method
A new method of estimating Rp using sediment P loading An alternative and novel approach can be taken by reframing the Vollenweider mass balance model from a sedimentary perspective. Rather than considering P loading to the sediment as a loss, we can think of it as a record of changing P load to the system that can be measured directly. We can then estimate R P using lake sediment core P measurements. As L sed = L in -L out , substitution of L sed into Eq. 7 gives: Quantification of L sed requires sediment P concentrations and average sediment mass accumulation rates, and therefore a well dated record is necessary. It is possible to substitute L in for L out in Eq. 13, however L in is typically more difficult to measure reliably (Dillon and Evans 1993). The potential for deriving R P from the sediment P record and L in was briefly alluded to by Dillon and Evans (1993), although it was not shown mathematically or developed further as their focus was on L sed .
Using sediment record to infer past lake water TP Expressed to include P loading to the sediment (L sed ) the Vollenweider model has a direct application to palaeolimnological reconstruction as it becomes possible to calculate changes in lake inflow P loading through time. If a dated record of P loading exists and there is an applicable method of calculating R P for the site, the mass balance approach can be used to infer past water TP lake concentrations from the sediment record by combining Eqs. 8, 9 and 13 to give: While the simplest form of this expression is given by the final term of the equation, v is generally not estimated independently; though an approach for doing this was proposed by Binford and Brenner (1986), it has not been applied for this purpose. Consequently, we most often calculate TP lake from L sed , R P and q s .
It must be stressed that for the purpose of this study we assume that R P is constant through time, although strictly speaking we would expect it to vary with hydrology, and perhaps with trophic status (Nürnberg 1984(Nürnberg , 2009). In principle, given an estimated hydrological history, for example Hadley Centre climate model hind cast values, HadCM3 GCM (Gordon et al. 2000;Pope et al. 2000), a time specific R P can be calculated. In practice, the changes are small for the timescales reconstructed in this study and would be based on uncertain, low resolution data so we have not attempted this.
Note that L sed (equivalent to P lost to sediment from Eq. 2) refers to net sediment P burial, incorporating gross burial, resuspension, and diffusive internal loading into a single net burial term. The model also applies to total sediment P rather than specific sediment P fractions, a decision necessitated by the limnological mass balance which is expressed in terms of TP. Lake-wide mean P burial from sediment core data Equations 13 and 14 depend on the lake-wide mean long-term net P burial rate (L sed ). Individual sediment cores provide measured net P burial rates for specific locations in the lake, but these differ from the lakewide mean value. Sediment core mass accumulation rates (MAR Core ) for deep coring locations generally overestimate the lake-wide mean sediment mass accumulation (MAR) rate because (1) near-shore areas and steep slopes within the lake fail to accumulate lake sediment, and (2) even within the sediment accumulating area (A SAA ), the mass accumulation rate varies with water depth due to sediment focussing (Hilton et al. 1986;Blais and Kalff 1995;Rippey and Anderson 1996). This procedure is further described in the discussion. For each of our case study sites, we make the simplifying assumption that A SAA = A L (except for Lake Erie and Lake Ontario, where the data sources compensate for this effect), and use the model of Håkanson (2003), using mean lake depth (z) and core depth (z core ). In this study MAR is calculated as: Study sites This study uses published data from lakes with both sediment P flux records and water quality monitoring data. We provide worked examples of our sediment mass balance approach and compare our reconstructed TP records to monitored lake water values. Six lakes were found to have sufficient information: Lake Annecy, Lake Erie, Hatchmere, Loweswater, Lake Ontario and Lake Plešné. Sources for the published data are shown in Table 2, and information about coring locations and lake properties are in Table 3. Although these lakes differ in characteristics, they are not representative of all lake types as choice was limited by availability of published information. The reported hydrological and morphological information available was found to vary between the publications. The data and calculations are thus described on a sitespecific basis below, with further information and formulations in Table 2.

Lake Annecy
To calculate L sed at Lake Annecy a single sediment core P burial rate (L Core ) was obtained from the MAR Core and P data of Loizeau et al. (2001), which was adjusted using Eq. 15 using coring depth (Loizeau et al. 2001). Mean lake depth was calculated from the lake volume and area data of Perga et al. (2010). This mean depth value and the water residence time (Perga et al. 2010) were used to calculate q s . The lake water TP data of Perga et al. (2010) are compared with the sediment inferred TP lake values. For calculation of R P (sed) (Eq. 13), L sed was from the uppermost sediment interval, and L out based on TP lake averaged across 1971-2006.
Lake PlesˇneÁ t Lake Plešné, L sed for a Holocene sediment core used data from Norton et al. (2016) for P and MAR core (electronic copies provided by the Jiří Kopáček and Josef Veselý). Average values for lake water TP corresponding with the sediment record are also taken from Kopáček et al. (2006). The value of q s was obtained from runoff (R) (Kopáček et al. 2006) and area data for the lake and catchment (Kopáček et al. 2004). The monitored TP concentration data (TP lake ) are from Č tvrtlíková et al. (2016). For calculation of R P (sed) (Eq. 13), L sed was measured by Kopáček et al. (2004) using sediment traps, and L out based on the TP lake average of Kopáček et al. (2006).

Loweswater
The Loweswater data are largely from a single report (Goldsmith pers. commun.), L sed is based on data from four 210 Pb dated cores, adjusted using Eq. 15 and then averaged. For calculation of R P (sed) (Eq. 13), L sed was taken directly from Goldsmith (pers. commun.) and TP lake is the average of Environment Agency monthly monitoring data (2009)(2010)(2011)(2012)(2013)(2014).

Lake Ontario
To calculate L sed , the basin total sediment accumulation (MAR flux ) (Kemp and Harper 1976) was multiplied by the recent sediment P concentration (P) of two cores (Schelske et al. 2006) and then divided by A L . To reconstruct long-term inferred TP lake , MAR and P for the two short cores (Schelske et al. 1988) were used to obtain two L core records. These were scaled up to the whole basin using an estimate of the basin-wide P burial rate (using MAR flux of Kemp and Harper (1976) over the period 1850/1930 to 1970), and the average P from the overlapping period (1970)(1971)(1972)(1973)(1974)(1975)(1976)(1977)(1978)(1979)(1980). For calculation of R P (sed) (Eq. 13), L sed is the value described above and TP lake is the average across 1970-1980 from Chapra and Dolan (2012).

Lake Erie
Mean L core was calculated by taking an area-weighted average of across the three main sedimentation basins of Lake Erie, from the six cores of Williams, Murphy and Mayer (1976). The L core values were found by multiplying MAR ) by surface sediment P. L sed was found from mean L core using the ratio of sediment accumulation area to lake total area. For the long record, the sediment core P records for Stations 1, 2, 3, and 6 of Williams, Murphy and Mayer (1976) were scaled using mean L sed for the most recent samples (ca. 1970), effectively assuming constant mass accumulation rate. This constrains the most recent L sed value to be equal to the recent mean, but allows older values for samples to vary freely. For calculation of R P (sed) (Eq. 13), L sed is the value described above and TP lake is the average across 1969-1971 from Chapra and Dolan (2012).

Hatchmere
Recent L sed for Hatchmere was calculated from the published P sediment core record of Boyle et al. (2015). For each sediment interval for the core dating between 1990 and 2011 (to reduce the impact of any surface P enrichment), MAR core was multiplied by the corresponding P to obtain a series of L core values. These were averaged and converted to L sed . For calculation of R P (sed) (Eq. 13), L sed is as described above, averaged 1990-2011 and TP lake is the average of Environment Agency monthly monitoring data (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015).
Reconstructing TP R P values for each site were calculated using three methods; recent L sed combined with monitored L out , Table 3 Coring location details and lake properties. Data source given in Table 2 Lake Mean recent TP mg m -3 These are referred to as R P (sed), R P (v = 10), R P (v = 29), and R P (K&D). Again, it must be stressed that for each model we calculate a single temporally fixed R P value; there is no intent or basis here for establishing changes in R P through time. The L sed method for finding R P uses only the recent sediment record, together with monitored L out , and the other methods use a fixed q s value based on recent sitespecific data. Long-term inferred lake water TP (TP lake ) was reconstructed for each lake using the four site-specific R P values and the published long core L sed records. The most recent part of these TP lake reconstructions were then compared with published water quality monitoring records (references in Table 2). The published monitored TP lake data have variable monitoring frequency and so have been converted to annual averages for use in this study. The exception to this is Lake Erie where an area weighted annual average was calculated from the published TP lake records for individual basins within the lake.

R P models
The lakes used in this study have areal water loading (q s ) values ranging from 7.1 m yr -1 (Lake Erie) to 22.6 m yr -1 (Hatchmere) (Fig. 1). This leads to a quite narrow range of model R P values; 0.56-0.80 for R P (v = 29), 0.31-0.58 for R P (v = 10), and 0.46-0.60 for R P (K&D). All being based on q s , these values correlate highly, the lowest r 2 being 0.98 for R P (v = 29) with R P (K&D).
R P (sed) correlates significantly with the model R P values (r 2 of 0.42, 0.49 and 0.54 for R P (v = 29), R P (v = 10) and R P (K&D) respectively), and has a similar mean value (0.61, compared with 0.45, 0.69 and 0.53 for R P (v = 29), R P (v = 10) and R P (K&D)). This comparison is illustrated graphically on Fig. 1, where only Lake Erie falls outside the range of model values.

TP lake reconstruction
For the recent sediment, the monitored lake water TP lake values and the range of reconstructed TP lake values can be compared by correlation analysis (Fig. 2). R P (v = 29) shows good agreement with the 1:1 line for Lake Erie, Lake Ontario and Hatchmere, but underestimates TP lake at Lake Annecy and Lake Plešné. R P (v = 10) does the opposite. For Loweswater there is little difference between v = 10 and v = 29. R P (K&D) shows reasonable fits for all but Lake Erie and Lake Ontario. These are sites that are known independently to have high apparent settling velocities. When inferred TP lake for Lake Erie and Lake Ontario is calculated using the published v sitespecific values, 19 for Lake Ontario and a mean value of 25 for Lake Erie (Chapra and Dolan 2012), then better fits are seen.
The inferred historical values for TP lake for the six case study lakes are shown in Fig. 3 for the period 1800 to recent (coring dates ranging 1970-2011).
Reconstructions are shown using each of the four R P values calculated for each site and are compared to lake water TP monitoring data. For Lake Erie and Fig. 1 Comparison of predicted and sediment-inferred R P values. Red line = R P (K&D), blue line = Vollenweider R P (v = 10), grey ribbon = range of Vollenweider model R P values, with v estimates from literature (v = 10 to v = 29). Symbols = R P (sed) for the six case study sites, with uncertainty estimated numerically (999 repeats simulating Gaussian scatter of both L sed and L out assuming SE = 10% for both) Lake Ontario, modelled TP concentration from Chapra and Dolan (2012) is also shown, which effectively project monitored data into the recent past.
The failure of sedimentary and monitored data period to overlap at two sites (Lake Annecy and Lake Plešné), and limited overlap at the other sites, precludes precise comparison. Of the predicted R P methods, the two Vollenweider model (v = 10 and v = 29) estimates of TP lake bracket the monitored values at all sites except Lake Erie, as shown in Fig. 2. In terms of universality, estimates based on R P (K&D) provide the best average match, though strongly overestimating the Great Lakes sites, where Vollenweider (v = 29) performs best. Constrained to do so, inferred TP lake based on R P (sed) provides the best match with monitored TP lake .

Discussion
The case study lakes A central objective here is to establish whether lake water TP can be inferred from sediment P concentration records when they are calculated as loadings. We have shown two methods that can do this, the first using model predicted R P and the second using tailored R P values based on the sediment record. As the latter approach constrains the most recent L sed values to be equal to the recent mean, the results cannot be fully evaluated using the monitoring data. Fig. 2 Comparison of sediment inferred TP lake (mg m -3 ) with modern monitored TP lake (mg m -3 ). Where sediment inferred and monitored data overlapped, the modern values are averaged across the overlap period. Otherwise, the most recent sediment inferred value was compared with the monitored value closest in time. Values based on R P (K&D) in orange, R P (v = 29) lower black dot, R P (v = 10) upper black dot. Site-specific R P values (blue cross) are show for Lake Erie and Lake Ontario. See text for explanation Fig. 3 Time series of TP lake , comparing sediment inferred with monitored values. Points = monitoring data. Grey ribbon = Vollenweider R P inferred TP range (v = 10 to v = 29).
Red line = R P (K&D) inferred TP. Yellow line = R P (sed) inferred TP. Blue line = model values from Chapra and Dolan (2012) Consequently, we focus first on testing the success using independent modelled R P values. Figures 1 and 2 compare sediment inferred lake water TP (TP lake ) based on differing R P values. Considering all values regardless of R P method, individual values lie as far as a factor of 3.6 from the observed concentrations (Fig. 2), with lakes other than Lake Erie and Lake Ontario lying closer to the 1:1 line. For an uncalibrated universal model this represents a remarkable result, the scatter being comparable to diatom inference models (Bennion 1994;Cumming et al. 2015). This shows that our approach produces sensible reconstructed values even when R P is suboptimal. The Vollenweider model based on upper and lower limiting velocities of 10 and 29 come close to bracketing the 1:1 line, showing that unsurprisingly site-specific choice of v would yield a better outcome. Values based on R P (K&D) yield better results for all but the Great Lake sites, just as a Vollenweider model with an intermediate v would have done. The Great Lakes sites stand out as the least well fitted. These are known to have exceptionally high apparent settling velocities (Chapra and Dolan 2012) and uniquely amongst our study sites have independently estimated velocities. If TP lake is inferred using these site-specific values, shown in Fig. 2, then results are improved relative to the overall spread of the data. This experimentation serves to show that appropriate sitespecific R P estimates give optimal results, greatly better than the factor of 3 range. Unfortunately, the site-specific data are rarely available as they require substantial long-term monitoring and modelling.
Specific deviations between monitored and inferred TP lake may be explained by factors other than R P choice and representativeness of the sediment cores. They can also be impacted by the use of total P measurement rather than specific sediment P fractions. Stream water TP concentration may underestimate the detrital P load, even when measured on unfiltered samples. In both Lake Erie and Lake Ontario it is known that a substantial part of sediment P comprises detrital apatite (Williams et al. 1976). This is likely under-represented in the inflow loadings, thus leading to inferred TP lake values that overestimate the monitored TP lake . Problems arising from poorly estimated R P and potential bias resulting from atypical sediment P fractions can be avoided by using tailored R P values based on the sediment data (R P (sed)). While hydrologically estimated R P values yield sensible results as described above, the R P (sed) should give results that are more accurate. R P (sed) inferred TP lake , constrained to give correct average recent values, shows divergence for older sediment that is assumed to reflect the differing nutrient histories at these sites (Fig. 3). However, there is no direct method to verify values that predate monitoring. There is, however, an indirect approach as illustrated by Rippey and Anderson (1996) at Augher Lough which shows good agreement between diatom inferred P loading and sediment P loading for sediment dating 1850-1980. We have recalculated their L sed records as TP lake using our method and find equally good agreement between our sediment inferred TP lake and their diatom inferred TP lake (Fig. 4). An apparent stationary P peak leads to a substantial discrepancy in the final 15 years for the record, but even with this r 2 is 0.95 for all intervals. This degree of agreement is remarkable for two wholly independent methods, particularly given the absence of site-specific parameterisation of our geochemical method, and a promising first comparison.
It is clear then that our method can produce sensible results even when based on the minimum available data (single core estimates of L sed , and R P based only hydrological data and a generalised model). With TP monitoring data, potential biases resulting from sitespecific factors can be reduced further.
Why hasn't this been done before?
The idea of linking the Vollenweider model to lake sediment records is not new but has been largely Fig. 4 Comparison of diatom inferred TP with our geochemically inferred TP for Augher Lough, Northern Ireland (Rippey and Anderson 1996). Our model is applied using only data provided in the original paper (R P (K&D) based on q s-= 12.1 m yr -1 , calculated from quoted value for Z-mean and q, the water flushing rate). Black line = diatom inferred TP. Red line = R P (K&D) inferred TP. Yellow line = R P (v = 10) inferred TP. Blue line = R P (v = 29) inferred TP overlooked despite its potential for TP reconstruction, and consequent value in managing lake eutrophication. Moss (1980) applied the Vollenweider (1975) model to a sediment P record from Barton Broad (Norfolk, UK), calculating lake water TP concentrations for four historic dates (1800, 1900, 1920, and 1940). Owing to the lack of information about approaches to generalising R P , Moss (1980) was obliged to use a simplifying approximation in the calculation (where v was neglected). Consequently, the full method was not applied. Furthermore, no monitoring data were available with which to verify the results of the reconstructed values and therefore the technique could not be fully validated. We have found no subsequent attempts to apply this method to calculate TP lake . Rippey and Anderson (1996) developed existing limnological mass balance approaches (Vollenweider 1975;Moss 1980;Ahlgren et al. 1988), and used their model to convert a diatom inferred TP record from Augher Lough (Northern Ireland) to historic lake P inflow loadings. They compared this to a record of lake-wide sediment P flux to test the utility of their model, finding good agreement (r 2 = 0.73). Jordan et al. (2001) also applied this method at Friary Lough (Northern Ireland). These studies reinforced a similar study by Dillon and Evans (1993), who compared estimated P budgets with sediment loadings for seven lakes in Ontario. Their study found reasonable agreement, and they concluded that the sediment P loadings are useful, and less ''tedious and expensive'' than determining hydrological P budgets. None of these studies attempted to turn their sediment P flux records into lake TP concentrations, and nor did they apply an R P coefficient in their calculations, despite commenting on its value. Much later, Boyle et al. (2013Boyle et al. ( , 2015 did make use of R P , using the empirical model of Kirchner and Dillon (1975), to calculate catchment P export from lake sediment records. They did not, however, extend this to the calculation of TP Lake .
While there has been little interest in applying the mass balance approach to sediment P records, a number of studies have reported on links between lake water TP and lake sediment P concentration data. These observed gross agreement between sediment P concentration records and historic observed changes in lake water TP (Shapiro et al. 1971;Engstrom and Wright 1984;Anderson et al. 1993), but also observed deviations between sediment records and historical TP changes, particularly in relation to changes in sedimentation rate (hence the need to use burial rates rather concentrations) and to surface P enrichments, which are discussed further below. While all of these studies (Shapiro et al. 1971;Engstrom and Wright 1984;Anderson et al. 1993) concluded that useful information was recorded in the sediment, they are nevertheless best remembered for questioning the reliability of sediment P records. At the same time diatom based TP reconstructions were developed and widely adopted (Hall and Smol 1992;Anderson et al. 1993;Bennion 1994). Consequently, despite the positive findings and methodological developments, the link between mass balance models and lake sediment P loading records has not been exploited by the palaeo community, precluding access to these useful quantitative estimates of historic nutrient baselines.
How generally applicable is our method?
The integrity of sediment P records has been reviewed by Engstrom and Wright (1984) and Boyle (2001), who conclude, based both on reasoning and field observation, that under favourable conditions (oxygenated hypolimnion, sedimentation rate high enough to minimise the role of diffusive P migration) lake sediments yield a useful record of the P supply history. While most subsequent studies agree with this position, three specific issues remain a matter of concern: the impact of hypolimnetic hypoxia on P retention by sediments; time lags caused by exchange between the lake sediment and the water column; and presence of temporary diagenetic P enrichment of the surface sediment. Nürnberg (1984) assessed the impact of anoxia on whole-lake P budgets and found that for lakes experiencing at least seasonal hypoxia mean internal loading amounted to 19% of external load, leading to reduction in lake P retention (R P ). In principle our model could be driven with a variable R P in response to such changing environmental conditions. However, in practice the factors governing reduced P retention are not well enough understood, and we have therefore not attempted to vary R P for any lakes in this study. The difficulty is separating the impact of anoxia from lagged consequences of past high external P loading. The Nürnberg (1984) study found that the anoxic sites in the data set were exposed to far higher external loadings (mean 2490 mg m -2 yr -1 ) than the oxic lakes to which they were compared (395 mg m -2 yr -1 ), and it is likely that a part of the budget imbalance was due to sediment-enrichment rather than oxic status. If so, then the average impact of anoxia may have been overestimated. This interpretation is supported by Prairie et al. (2001) who observed anoxia-enhanced P exports at only 2 of their 10 study sites. On the other hand, the analysis of Nürnberg (1984) only considers the lake wide budget, so could the impact on any specific sediment coring location be greater? This likely depends on the duration of the seasonal hypoxia. Where this is brief, any P released to the hypolimnion is likely to be returned to the sediment on breakdown of stratification (Mortimer 1941). If the hypoxia is permanent, then any released P is less likely to be returned to the point of release, but rather transferred laterally to shallower oxygenated sediment. In this case individual deep-water cores may have reduced P loads, while shallower cores have elevated loads in compensation. Hypothetically, the problem then becomes one of sampling. However, at present too little information exists to know whether this effect is sufficiently large to be measured, and thus whether there is a need for correction. Further research is needed in this area, and we simply advise exercising caution in applying the model to lakes subject to this effect.
The second issue relates to temporary storage of P in lake sediments, which has long been known to impact whole-lake P budgets on multi-year timescales. While the long-term ([[ annual) balance is controlled solely by L in , L sed and L out , a shift in external loading in the short term can lead to substantial temporary accumulation of P in the sediment, and thus a temporary increase in internal loading. Published data suggests this effect typically has a half-time in the order of 10 years or less (Nürnberg 1984;Marsden 1989;Jensen et al. 2006) which has the potential to produce a visible lag in the sediment record. Nevertheless, even if such lags are common, a long-term P loading reconstruction based on a steady state model will still yield useful information, providing an allowance is made for lags at the interpretation stage.
The third issue relates to long-lived P concentration enrichment of the uppermost sediment layers, which is widely reported (Carignan and Flett 1981;Engstrom and Wright 1984) and is attributed to diagenetic cycling of sediment P which transports P from deeper anoxic sediment to the oxygenated surface (Farmer et al. 1994). Such peaks migrate upwards as sediment accumulates, holding stationary with respect to the sediment surface. Consequently, these stationary peaks are not related to the contemporary external P supply and need appropriate treatment when interpreting the sediment record. The simplest approach is to disregard the affected portion of the record, limiting interpretation of the sediment record in relation to timescales of recent change. In this study, this corresponds to an affected record lasting approximately 10 years in the case of the recent sediments at Loweswater (Fig. 3).
A key question for anyone wanting to apply our method is whether the effects described above seriously compromise the record at a specific site, or for the specific objective of the research. The model output will be of uncertain reliability for lakes where changes in long term sediment P retention cannot be quantified. However, this does not mean the model cannot provide useful reconstructions of the timing and magnitude of change. For example, a number of studies have shown that sediment P peaks reflect historical timings (Engstrom and Wright 1984;Jilbert et al. 2020;Søndergaard and Jeppesen 2020), and in the case of Augher Lough (Fig. 4), a small hypertrophic lake, the TP magnitude is captured well by the model.
In contrast to changed P retention, it is important to stress that the temporary sediment P stores and stationary peaks described above do not impact the capability of sediments to record long-term ([[ annual) average lake water TP concentrations; information ideally suited to quantifying past lake water TP reference values. They do, however, prevent application of the methods to very recent change (* decadal) of the type best measured by monitoring data.
Will this work at my lake?

Data requirement
In order to apply the model to any lake, certain combinations of variables are required. Here we lay out what is needed, and how the values can be obtained. We then outline the assumptions that underlie our approach.
The model can be applied at a lake if two variables can be reliably estimated. (1) The lake-wide P loading record (L sed ) and (2) the areal water loading (q s ), which are needed to obtain the phosphorus retention coefficient (R P ). There are a number of approaches to obtaining each, outlined here. For key to notation see Table 1.
(1) L sed a. Minimum requirement, L sed ¼ L core Â z=z core b. Better, L sed ¼ 1 n P n 1 L core Â z=z core ð Þ , where there are n cores At a minimum there must be a dated P concentration record, with sediment dry density data, from which a P loading record for the core (L core ) can be calculated. Here we use total sediment P to capture the whole mass balance. The model has not been adapted for use with individual P fractions, though in principle a measure of labile P could be substituted for total P in order to exclude a terrigenous fraction. L core can be then adjusted to predict the lake-wide average P loading (L sed ) using Eq. 15 (Håkanson 2003). This correction for focussing is imperfect, but preferable to leaving the data uncorrected. Ideally, there should be multiple cores (Dillon and Evans 1993;Rowan et al. 1995;Rippey et al. 2008). If there were sufficient cores, randomly located, then simple averaging would estimate L sed . However, generally this is not the case. Instead, if there are multiple cores, L core can be calculated for each core, scaled by depth (Eq. 15), and then averaged.
(2) For q s , there are several different approaches, depending on what is known.
• With lake area and measured outflow q s ¼ Q=A L • With lake mean depth and water residence time q s ¼ z=s • With areal discharge, lake area, and catchment area q s ¼ R Â A C =A L • If only MAP and MAT available, use the method of Turc (1954) to obtain R, or textbook alternative methods for estimating evaporative loss.
Generally, more than one way is available in which case estimates can compared and combined. With q s estimated, R P can be found: • Minimum requirement, R P ¼ f q s ð Þ, found using either Kirchner and Dillon (Eq. 11), or Vollenweider (Eq. 10) • Or, if there is a monitored record of lake TP, R P ¼ L sed L sed þL out , where L sed is the mean sediment P loading for the recent record (ideally, corresponding to the period of monitoring), and L out ¼ q s Â TP lake If a history of varying q s is known then a corresponding variable R p could be calculated.

Conditions and assumptions
For the method to produce useful results certain conditions must be assumed.
• The sediment record is sufficiently well dated • The sediment record is representative of the whole lake • The sediment P signal is preserved • R P does not change during the record (or the variation in R P can be reliably quantified) These conditions do not differ greatly from those that underlie palaeolimnology in general. However, sediment records of P burial are sensitive to sediment focussing, such that sites are problematic where focussing is unpredictable (e.g., complex basins), or where deltas substantially contribute to the lake-wide P total. The sediment P signal preservation will be least good where low mass accumulation rates leave a substantial role for diffusive fluxes.
Under most circumstances we assume these conditions will be met. However, the model domain (i.e. where the model is applicable) remains largely untested, as at present we only have six case studies with sufficient information. Below we list some cases which might be expected to lie outside the domain, warranting further research. Clearly this list is not exhaustive.

Meromictic lakes
Lateral P fluxes in lakes with permanent or near permanent bottom water hypoxia are poorly understood such that the impact on individual sediment core records is uncertain. These of course also do not meet the Vollenweider model condition of a well-mixed lake.

Lakes with complex basins
In large lakes with complicated bathymetry (e.g., multiple deposition basins) it is difficult to reliably scale the core data up to lake-wide averages. 3. Unsampled climatic zones All our case studies are from temperate climatic zones. Although we have no reason to suppose there are problems, the behaviour of lakes outside this zone are untested. 4. Lakes with uncertain subsurface hydrology Substantial subsurface flows present a theoretical problem. Karstic lakes could in principle be modelled, providing there is a reliable measure of outflowing water. 5. Shallow wind-stressed lakes Shallow lakes that are subject to significant windgenerated mixing can have disturbed sediment profiles with hiatuses and inversions precluding meaningful record interpretation. However, some shallow wind-stressed lakes have uniform sediment and can yield reliable sequences. Thorough evaluation of record coherence is therefore essential.

Conclusion
Here we have adapted an existing steady-state model to allow reconstruction of long-term average historic lake water TP concentrations from the sediment P burial record, making an important contribution to the continued development of palaeolimnology as a tool for lake management. Our model can be applied without site-specific parameterisation, thus potentially having universal application. In principle the model is applicable at any site where there is both a sediment P burial record and knowledge of the current water budget, although as discussed above we advise caution applying it to problematic sediment records. Tested at six published case study sites, modelled lake water TP values agree well with water quality monitored data, and limited comparison shows good agreement with wholly independent diatom inferred lake water TP. These findings, together with a review of the literature, suggest that lake sediments can preserve a record of long-term average P burial rate from which the long-term mean lake water TP can be estimated. However, sub-decadal smoothing can limit application of the approach at shorter temporal resolutions and issues with preservation can limit the applicability of the model in certain instances. Our approach enhances the contribution of palaeolimnology to lake restoration by turning the sediment P record into a form more meaningful to lake management (long-term average TP concentrations). These reconstructed TP records can provide a long-term perspective on past lake water quality, and can be used to define site-specific reference values and nutrient targets used in lake management.