Bayesian Analysis of 210Pb Dating

In studies of environmental change of the past few centuries, 210 Pb dating is often used to obtain chronologies for sedimentary sequences. One of the most commonly used approaches to estimate the age of material at different depths in a sequence is to assume a constant rate of supply (CRS) or inﬂux of ‘unsupported’ 210 Pb from the atmosphere, together with a constant or varying amount of ‘supported’ 210 Pb. Current 210 Pb dating models do not use a proper statistical framework and provide poor estimates of the uncertainties. Here, we develop a new model for 210 Pb dating, where ages and values of supported and unsupported 210 Pb form part of the parameters. We apply our model to a case study from Canada as well as to some simulated examples. Our model can extend beyond the current CRS approach, deal with asymmetric errors and mix 210 Pb with other types of dating, thus obtaining more robust, realistic and statistically better deﬁned age estimates.


Introduction
Radiometric dating is a series of techniques used to date material containing radioactive elements (Gunten et al., 1995) which decay over time. These techniques use the radioactive decay equation (N (t) = N 0 e −λt , where N (t) is the quantity of a radioactive element left in the sample at time t, N 0 is the initial quantity, and λ is the element's decay constant) to infer the age of the material being investigated. 210 P b (lead-210) is a radioactive isotope which forms part of the 238 U (uranium) series. 238 U (solid) is contained within most rocks and decays into 226 Ra (radium, solid), which later decays into 222 Rn (radon, gas). Since 222 Rn is a gas, a proportion escapes to the atmosphere where it decays into 210 P b (solid) which is later transported to the earth's surface by precipitation. 210 P b deposited this way is labelled "unsupported" or excess 210 P b (P U ). On the other hand, 222 Rn which decays in situ becomes what is labelled "supported" 210 P b (P S ). By distinguishing between supported and unsupported 210 P b one can determine the age of the sediment through measuring the 210 P b at a depth d and compare it to the rest of the sediment.
The sediments within lakes, oceans and bogs contain layers of biotic and abiotic fossils, which can be used as indirect time-series of environmental dynamics as the sediments accumulate over time.. Whereas both unsupported and supported 210 P b decay over time, supported 210 P b is replenished through decay from radon contained within the sediment.
That is why the concentration of supported 210 P b remains at equilibrium while that of the unsupported 210 P b decreases and eventually reaches zero. This is the basis for 210 P b dating.
Given its relatively short half-life of 22.3 years, 210 P b has been used to date recent (< 200 yr old) sediments. This time period is of particular importance when studying the effects of humans on the environment. Unfortunately, the current dating models were not created within a statistical framework. This has given place to a series of uncertainty approximations (Appleby, 2001;Sanchez-Cabeza et al., 2014). Here we introduce both a new treatment of 210 P b data and a dating model created within a statistical framework, with the objective of providing more reliable measures of uncertainty.
2 Modelling of 210 P b data As outlined above, within sediment 210 P b is naturally formed from two sources -from surrounding sediment and rocks containing 238 U (supported), and from the atmosphere through 220 Rn (unsupported). Modelling these two sources of 210 P b is crucial to the development of age-depth models. Since supported and unsupported 210 P b are indistinguishable from each other, in order to model both sources, we have to make assumptions depending on the measurement techniques used. Measurements of 210 P b can be obtained by alpha or gamma spectrometry. The latter technique also provides estimates of other isotopes such as 226 Ra, which can be used as a proxy of the supported 210 P b in a sample (Krishnaswamy et al., 1971).

Supported 210 P b
If gamma spectrometry is used, supported 210 P b can be assumed to be equal to the concentrations of 226 Ra. When the sediments are analysed using alpha spectrometry, 226 Ra measurements are not available and estimates of the supported activity can only be obtained by analysing sediment which reached background (samples which no longer contain unsupported 210 P b). When alpha spectrometry is used, a constant supported 210 P b is assumed. These two different ways of inferring the supported activity can be formalised by the following equations: where P T i is the total 210 P b, P U i is the unsupported 210 P b and P S i is the supported 210 P b in sample i. Depending on the site and availability of measuring techniques, one of these equations can be used to differentiate supported from unsupported 210 P b.

Unsupported 210 P b
In order to model the unsupported 210 P b, some assumptions have to be made regarding the precipitation of this material from the atmosphere. A reasonable assumption for this phenomenon is the constant flux or rate of supply (Appleby and Oldfield, 1978), which implies that for fixed periods of time the same amount of 210 P b is supplied to the site.
Following Appleby and Oldfield (1978), the assumption of a constant rate of supply implies that the initial concentration of 210 P b at depth x (which is linked to age by a function t(x)), P U 0 (t(x)), weighed by the dry mass sedimentation rate r(t(x)), is constant throughout the core: where Φ is a constant. The dry sedimentation rate is the speed at which the sediment accumulates, weighed by the sediment's density at such depth, i.e.
where ρ(x) is defined as the density of the sediment at depth x and dx(t) dt is the rate at which the core accumulates with respect to time. Considering that the relationship between depth and time is expressed by the function t(x), then x(t) is the inverse function of time, and since t(x) is a one-to-one function Since 210 P b is a radioactive isotope it follows from the radioactive decay equation that where P U (x) is the concentration of unsupported 210 P b found at depth x, and λ is the 210 P b half-life. Using equations (3), (5), and (6) the following relationship is obtained, Considering that 210 P b is measured over a slice or section of the sediment, this relationship has to be integrated over such section to be related to the corresponding measurement, that is, where (a, b) are the lower and upper depths of the sample respectively and A U (a,b) is the activity in section (a, b). Equation (9) provides a link between the age-depth function t(x) and the unsupported activity in a given section. This is the primary tool to construct an age-depth model based on a constant rate of supply.

Current approach
The Constant Rate of Supply (CRS) (Goldberg, 1963;Robbins, 1978;Appleby and Oldfield, 1978) model is the most commonly used 210 P b dating model. It uses the constant rate of supply assumption presented in section 2, and the following equations to obtain a chronology: where A U (x) is the remaining unsupported activity below x, and A U (0) is the unsupported activity in the whole core. The CRS model can be summarized by equation (12) and from its term A U (0) one can deduce that this model depends strongly on measuring activity throughout the whole core. The effect of wrongly estimating this variable is described in Appleby (1998). If the activity cannot be measured throughout the entire core, interpolation is suggested (Appleby, 2001). Moreover, if the lowest sample has not reached background, and thus still contains unsupported 210 P b, extrapolation is suggested.
Because this model is based only on the unsupported activity, precise estimates of supported 210 P b are necessary in order to obtain reliable estimates of the unsupported 210 P b.
Depending on the equipment used to obtain the 210 P b concentrations, and on the model used to distinguish supported from unsupported 210 P b, this could be problematic. Wrongly estimating this variable will directly impact the estimate of A(0) which will in consequence affect the resulting chronology.

Example
To show the results of the current approach and later compare them to ours, data obtained from a site in Havre-St-Pierre, Quebec, Canada will be used. The core (HP1C) was obtained in July 2012 and was analysed using alpha spectrometry at Exeter University, UK . Table 1 contains the data from core HP1C. As previously mentioned, alpha spectrometry does not provide estimates of 226 Ra as is the case for beta spectrometry, but instead, contrary to the latter, it can measure far smaller quantities of 210 P b. To date this core, the CRS model was calculated using the recommendations in Appleby (2001).
One of the first steps to apply the CRS model is to identify the supported 210 P b. For this purpose the last 4 samples were averaged to obtain an estimate of 8.11 Bq kg and a standard deviation of 1.01 for the supported activity. This value was subtracted from the total 210 P b for each sample, to obtain estimates of unsupported activity. Following Appleby (2001) one can obtain the dating shown in Figure 1. This methodology requires very strong assumptions regarding independence, given the fact that it uses accumulated activity as the primary tool for inference. We now introduce our formal statistical approach for 210 P b dating, to solve this and several other issues inherent in the usual CRS technique just described.

A statistical approach to 210 P b dating
Let the concentration of 210 P b in a sample taken from section (x i −δ, x i ) be a random variable p i . In this case, it is important to clarify that this is not the cumulative concentration or activity, from the surface to depth x i , but rather it is the concentration found from depth separately and therefore it is safe to assume that each sample is independent of the other measurements and is normally distributed with mean the true concentration P i , and variance as reported by the laboratory: As mentioned above, the supported 210 P b is assumed to be in equilibrium throughout  (Appleby, 2001) showing the mean and 95% confidence intervals.
the core, which means that it remains constant through all depths. If 236 Ra measurements are available, a supported 210 P b value per sample can easily be included by letting P S i be different at each depth. It is important to note that this will greatly increase the number of parameters, and should only be used when the hypothesis of a constant supported concentration has been shown to be unreasonable. If a constant supported 210 P b is valid, then we can use equation (2) to infer the supported 210 P b. Now, assuming a constant rate of supply, as described in section 2, for the unsupported 210 P b, the activity in sample i can be obtained as follows Since the supported 210 P b can be assumed to be constant, the supported activity of It is important to note that the activity at each sample contains not only the information of ages but also of the supported 210 P b (P S ) and the initial supply of unsupported 210 P b (Φ) throughout the core.
To implement a Bayesian approach, prior distributions for each parameter have to be defined. Appleby (2001) suggested that the supply of unsupported 210 P b has a global mean of 50 Bq m 2 yr . This can be used as prior information to obtain a prior distribution for Φ. Because Φ is always positive, a gamma distribution can be considered and with this information we On the other hand, since supported 210 P b (P S ) varies much from site to site, defining an informative prior distribution for P S is in general not possible.
As a consequence, data for the supported 210 P b (P S ) is necessary (y S 1 , y S 2 , ..., y S ns ). These data can come from two different sources; 226 Ra estimates or 210 P b measurements which reached background. A prior distribution for the P S (supported 210 P b) associated with these data is necessary. Little is known about this parameter prior to obtaining the data. We have seen cores ranging from nearly 0 up to almost 50 Bq/kg of supported 210 P b. With this information, a gamma distribution with a mean of 20 Bq/kg and shape parameter of α S = 2 would allow the data to contribute more to the posterior value of P S . Lastly, to define a prior distribution for the ages an age-depth function has to be defined.

Age-depth function
Since sediment cores can extend back thousands of years, 210 P b is not the only technique used to date them. 14 C (radiocarbon) is a common way to obtain age estimates for organic material. The radiocarbon community has built sophisticated chronology models, which rely on equally sophisticated age-depth functions, with the objective of reducing and better representing the uncertainty of the resulting chronology. Because we want our approach to have the flexibility to incorporate other dating information such as radiocarbon, we decided to incorporate a well-established age-depth function.
Bacon (Blaauw and Christen, 2011), which is one of the most popular chronology models for 14 C dating, assumes linear accumulation rates over segments of equal length. By using this same structure, age-depth models based on multiple isotopes could potentially be obtained. This is the age-depth model we are going to use and we discuss the general construction of the Bacon age-depth function next (see Blaauw and Christen, 2011, for details). The age-depth function is considered as linear over sections of equal length, causing depths to be divided into sections of equal length c 0 < c 1 < c 3 < ... < c K , noting that c 0 = 0. Between these sections linear accumulation is assumed, so for section c i < d < c i+1 the model can be expressed as where m = (m 1 , m 2 , ..., m k ) are the slopes of each linear extrapolation, and ∆c = c i − c i+1 is the length of each section.
With this structure a gamma autoregressive model is proposed for the accumulation rate of each section, m j = ωm j+1 + (1 − ω)α j where α j ∼ Gamma(a α , b α ) and ω ∈ [0, 1] is a memory parameter which is distributed prior to the data as ω ∼ Beta(a ω , b ω ).
Using the above age-depth function, and (16), the log-likelihood for the model takes the form This likelihood contains all the parameters needed by our approach. Using the prior distributions previously mentioned, a posterior distribution f (m, ω, Φ, P S |ȳ,ȳ S ) is defined, from which we intend to Monte Carlo sample the model parameters using MCMC. To allow for faster convergence of the MCMC, a limit to the chronology is considered. This chronology limit is inspired by the 210 P b dating horizon, which is the age at which 210 P b samples lack any measurable unsupported 210 P b.

Chronology limit
The 210 P b dating horizon was described by Appleby (1998) to be 100 -150 years, based on the available knowledge and measurement techniques at the time. The dating horizon of a given core is affected by different factors. The first of them is the equipment used to measure the samples. If certain equipment has higher precision than another, it will be able to distinguish unsupported from supported 210 P b down to deeper samples and thus provide ages further back in time. The other factor that affects the dating horizon is the quantity of initial unsupported 210 P b, which is directly affected by the rate of supply (Φ). When there is a larger initial unsupported 210 P b it will take longer for the unsupported 210 P b in a sample to become indistinguishable from the supported 210 P b.
We therefore decided to set a dynamic chronology limit for our method. This limit (t l ) will be determined by two factors -the rate of supply of 210 P b to the site (Φ) and the error related to the equipment used to measure the samples. For example, lets assume that the equipment used to calculate the concentration of 210 P b in a sample has a minimum error of 0.01 Bq kg . Now, assuming that the sample comes from a bog with a density ranging between .05 to 0.2 g cm 3 (Chambers et al., 2011), then once the unsupported activity in a sample reaches A l .1 Bq m 2 , it would become indistinguishable from the supported activity. This information could help us to calculate the dynamic age limit. By using equation (14) we have where A l is the minimum distinguishable unsupported activity in a sample related to the equipment's error, Φ is the supply of 210 P b to the site and λ = 0.03114 is the decay constant and considering that 1−e −λ λ = 0.98459 is a constant, It is important to note that this limit depends on the error of the equipment and on the origin of the samples, which are factors known prior to obtaining the data. Moreover, Φ is a variable of the model. This will allow the model to limit the chronology given Φ. Blaauw and Christen (2011) propose the use of a self-adjusting MCMC algorithm, known as t-walk (Christen and Fox, 2010), which will facilitate the use of these techniques to nonstatisticians. The t-walk algorithm requires two initial points for all parameters (Φ, P S , w, α) and the negative of the log posterior function which is called the energy function, U (Φ, P S , w, α |ȳ,ȳ S ) = − log f Φ, P S , w, α |ȳ,ȳ S .

Implementation and MCMC
A program (in python 2.7) called Plum is used to implement this approach and to obtain a sample from the posterior distribution. Plum has been tested on peat and lake sediment cores, as well as on simulated data, providing reasonable results with no tuning of the parameters needed; examples of these results can be seen in sections 5 and 6. The consistency of these results, with minimal user input, show how the t-walk (Christen and Fox, 2010) was a suitable choice for this implementation.

Model comparison
To implement our approach to the HP1C data presented in section 3.1, Plum was programmed to use the last 4 samples from Table 1 as estimates of the supported activity, using the rest of the samples to establish the chronology. Figure 2 shows the results of the CRS model in red and our approach in black and grey. From this comparison we can observe that both models agree with each other down to a depth of 25 cm, at which point the CRS model continues at a similar slope unlike our approach which provides younger estimates.
This uninterrupted growth of the CRS model can be explained by its age function which is a logarithmic approximation, invariably tends to infinity as unsupported 210 P b reaches 0 .
Even with these discrepancies both models have overlapping confidence intervals, with our approach providing a more precise chronology in the topmost part and a more conservative estimate for the deepest part of the core.
This example shows the potential of our approach in a 'well-behaved' real-world case study, but unfortunately we cannot observe the precision of this approach when confronted with more challenging data sets, such at those which did not reach background and/or with missing data. For this purpose, several simulations were created where we know the 'true' chronology and can observe how our approach behaves in more challenging circumstances.

Simulated Example
To obtain simulated data, a constant supply of 210 P b was defined as Φ = 150 Bq kg , and by using the constant rate of supply assumption from equation (3) we have P 0 (x)r(x) = 150.
At this point, we can define ρ(x) to obtain r(x) by using equation (5) and the age function Using these functions, simulated samples at any given depth can be obtained by integrating each function between the top and bottom depths of the sample. Lastly, to simulate supported 210 P b a constant value was added to the simulations such that P i = P S + b a P U (x)dx, where a and b are the top and bottom depths of the sample. For this simulation we set the supported 210 P b to P S = 20. To replicate the measurement errors related to the concentration of 210 P b, white noise was added such that P i + where P i is the concentration found in sample i and ∼ N (0, σ i ). This exercise provided us with the dataset in Table 2. We use this simulated data set to test the precision of our approach in various circumstances. For this purpose, the last three sample points were designated as estimates of the supported 210 P b.
The best scenario for 210 P b age-depth models is when every core section is measured, from the surface to where background is reached. In this scenario any approach should reach the best results, thus providing the complete information about the decay of unsupported 210 P b. This scenario can be simulated using the complete data set from Table 2. Figure   3 shows the comparison between the chronology obtained by our model and that of the CRS Appleby (2001) alongside the real age function, and how both models include the true chronology in their 95% intervals. By applying our approach to this scenario, we obtained a very accurate chronology by taking the mean of the MCMC simulations. This shows, unsurprisingly, that our model behaves quite well in the best-case scenario. On the other hand, the CRS model provides a shorter chronology, because some samples had to be discarded from the chronology. This is a direct result from the logarithmic approximation mentioned in section 5. In this particular case, the two bottommost samples had to be discarded since the last sample was smaller than the mean of the three samples used to calculate the supported activity. On the other hand, CRS estimates younger ages for this example, which can be a result of the underestimated supported 210 P b as can be observed in figure 3. Another feature of the CRS worth mentioning is the rapid growth of the chronology in the last sample. As previously mentioned, this rapid increase can be attributed to the logarithmic approximation the CRS uses. Blue curve and shadow indicate CRS mean and its corresponding 95% range. Dashed black curves indicate mean and 95% confidence interval for our model. Grey lines are simulations from Plum. Red curve is the true age-depth model. The curve in the right represent estimates of the supported 210 P b (P S ) using the CRS model (red; dot shows the mean, parentheses show the standard deviation) and Plum (black curve). True supported 210 P b (P S ) is marked by a blue line.
The following scenarios deal with the behaviour of our model in circumstances where there is not complete dating information. Even if we attempt to use the CRS model to provide age estimate in these scenarios, it does this by interpolating and extrapolating in the sections where there is missing data. Applying the CRS model to these simulations would require us to take several additional heuristic decisions with large potential impacts on the chronology (e.g., exponential or linear extrapolation to beyond and/or between the dated levels etc., see Sanchez-Cabeza and Ruiz-Fernández (2012)). Such comparisons lie outside the scope of the present work but will be explored in a future study and consequently for the next examples we only study the performance of the Plum chronology.
Sometimes researchers do not have the funds to obtain a full, continuously measured dataset for the chronology that they want to build. When this is the case, only certain strategically placed samples are measured. To simulate this scenario, only the data at odd depths was used to obtain the chronology. Figure 4 shows the results from this experiment.
The accuracy of the model did not change as it still gives an accurate estimate of the true age model, and the precision was not greatly affected even though only half of the available data was used to calculate this chronology.
A common problem in 210 P b dating is not reaching background. To observe the behaviour of our model in these circumstances, the bottommost seven data points were removed leaving us with a dataset which has not reached background. Figure 4 presents the resulting chronology compared to the true age function. The chronology seems to be accurate down to a depth of 16 cm, from which point it seems to provide older estimates.
On the other hand, the model encloses the true chronology at all times even for the older ages.
The last scenario to which our approach was tested is missing topmost sediment. For this example, the data points with a depth of two to ten cm depth were removed leaving us with a data set with topmost missing data. Figure 4 shows the results of this experiment.
Even with a third of consecutive missing data, the model is able to accurately reconstruct the true age function.
Our approach behaves well in every tested scenario, as its accuracy is not greatly affected by any of the different scenarios we introduced.

Discussion
The approach developed here presents a more robust methodology to deal with 210 P b data.
The advantage comes from a more realistic measure of uncertainties, since the ages are parameters which are inferred in the process. Moreover, dealing with missing data, which is a common problem when dealing with 210 P b dating, becomes easier because our model does not need the whole core to be measured to obtain accurate results. Also, since the CRS model relies on a ratio, that approach requires removal of the bottommost measurement.
Since our methodology does not rely on a ratio, all the samples provide information to the chronology, making longer chronologies possible.
Because of the integration of the supported 210 P b into the model a posterior distribution of this variable can be obtained, as well as for ages at any given depth (not just those with 210 P b measurements) and the supply of 210 P b to the site. Figure 2 shows the posterior distributions of the supported 210 P b and the supply of unsupported 210 P b. These posterior distributions provide more realistic estimates of the uncertainty of these variables, which may be used for other studies where the main focus is not the chronology but other aspects of the site.
Another advantage of this methodology is the fact that since the model operates within a Bayesian framework, incorporating extra information is possible without having to 'doublemodel' by using previously modelled ages within an age-depth model. This information could come in the form of other radiometric ages, such as radiocarbon determinations. Since measurements of radiocarbon and 210 P b, given the age, are independent, the overall likelihood would consist of two parts; the likelihood from 210 P b and from 14 C. Therefore, Considering that the only link between both data is t(x), by using the same agedepth function such as that from equation (17), a chronology with both sources of data is possible. This becomes very important because the calibration curve (Reimer et al., 2013), which is used to correct the radiocarbon ages, is non-linear for the most recent few centuries, causing problems with interpreting radiocarbon ages. This period is partly covered by 210 P b.
By combining these two methodologies, more robust chronologies can be obtained for this important period in human and environmental history.