Introduction

Like all spatial features, plant distributions and their changes through time are best appreciated when displayed on a map. One of the first maps illustrating the distributional change of a species during the Holocene was the reconstruction of the maximal range of Corylus avellana in Sweden (Andersson 1902). Andersson collected the reports of hazelnut finds without knowing their stratigraphic position or having any means of dating them. Peat stratigraphy provided the chronological framework for von Post’s (1918) first pollen diagrams, where he used the distinct shift in peat humification around 2700 cal. b.p. to correlate the diagrams. Von Post realised that general changes in forest composition through time could be used to assign the strata to distinct periods. Changes in peat stratigraphy had previously been used to divide the Holocene into climatostratigraphic periods with distinct wet or dry phases (i.e. the Blytt–Sernander scheme, see Birks and Seppä 2010) and von Post (1924) used these periods to characterize their general forest composition. Using this chronological framework, and pollen data from about 250 profiles, von Post (1924) published the first maps depicting changes in tree abundance in time and space. Dividing the Holocene into six periods, he constructed maps of southern Sweden for different species, where abundance is represented by circle diameter. Szafer (1935) developed the mapped representation of pollen-analytical results further as he treated pollen percentages from 152 sites across Poland like elevation data and produced the first isopollen maps. The recognition of different herbaceous pollen types, and in particular the identification of cereal pollen by Firbas (1937), meant that archaeology and history could be used to provide chronological information on the vegetation history over the past few thousand years.
Firbas (1949) also presented isopollen maps for Central Europe, dividing the Holocene into four broad periods following the Blytt–Sernander scheme.

The possibilities of comparing pollen diagrams changed completely with the advent of radiocarbon dating in the 1950s. It became possible to plot pollen diagrams on an independent time scale and compare the timing of events rather than their relative position with respect to some other common pattern. Initially, radiocarbon age determinations were expensive and pollen-stratigraphical zones remained a generally accepted dating scheme in many parts of Europe. Radiocarbon dating was used to derive ages for the regional pollen zone boundaries and revealed that early Holocene vegetation development was often diachronous (Smith and Pilcher 1973). For example, as the expansion of Picea abies is one of the major changes in Finnish pollen diagrams, this feature was often directly dated with the radiocarbon method. This resulted in Aario (1965) and Aartolahti (1966) constructing isochrone maps depicting the timing of P. abies population expansion in Finland. For eastern North America, Bernabo and Webb (1977) presented isopollen and difference maps based on 62 radiocarbon-dated pollen diagrams. While Bernabo and Webb (1977) used percentage thresholds to derive isochrone maps, Davis (1976) depicted the postglacial spread of trees on isochrone maps, basing the local establishment of trees on pollen accumulation rates.

The publication “An Atlas of Past and Present Pollen Maps for Europe” by Huntley and Birks (1983) presented the first study of postglacial changes in the abundance of 46 pollen taxa through time and space at the sub-continental scale. The isopollen maps assembled in this atlas were mainly compiled from printed pollen diagrams, from 423 sites across Europe. Two thirds of these diagrams had some independent dating control with a total of more than 1,500 radiocarbon age determinations. However, compiling data from printed diagrams has limitations and it was realised that databases containing the pollen counts were needed for different applications including climate reconstructions (Fyfe et al. 2009). As a consequence the European Pollen Database (EPD) was created in 1992 alongside the North American Pollen Database (NAPD). The NAPD had a predecessor, the COHMAP pollen ‘database’ managed by T. Webb III, which was the basis for sub-continental climate reconstructions and data–model comparisons, and also yielded maps of past vegetation change (e.g. Prentice et al. 1991). This was followed by a comprehensive set of isopollen and derived biome maps for the past 21,000 years covering much of North America and mainly drawing on the data stored in the NAPD (Williams et al. 2004). The maps were constructed as 1,000-year time slices using the calendar age scale and collecting pollen samples in 500-year wide windows. However, the age models for the underlying individual pollen diagrams were based on the uncalibrated radiocarbon time scale. The age models stored alongside the pollen data in the EPD were also mostly based on the uncalibrated radiocarbon time scale, and the few maps of individual taxa that used the EPD were presented using this time scale (e.g. Brewer et al. 2002; Magri et al. 2006).
While the uncalibrated time scale is in most instances sufficient to document the diachronous nature of the spread of taxa, the offset between uncalibrated 14C and calendar ages biases interpretations of these abundance changes and hampers the comparison with other proxy information based on calendar time scales.

After almost 100 years of palynological research in Europe the general vegetation history of the continent is well known. However, updated or detailed maps exist only for selected taxa like Picea (Giesecke and Bennett 2004, Latałowa and van der Knaap 2006) or for particular regions or countries such as Poland (Ralska-Jasiewiczowa et al. 2004), Spain (Carrión et al. 2010) and the Alps (Burga et al. 1998, van der Knaap et al. 2005). Over the past decade there has been renewed interest in aspects of vegetation history from neighbouring disciplines, and this interest is ongoing. Genetic markers are being used to reveal ancestries between tree populations across Europe, thus indicating routes of postglacial spread of trees (Petit et al. 2002, Magri et al. 2006). Hindcast climate simulations make it possible to compare past vegetation distributions with vegetation models. This enables evaluation of climate simulations as well as the vegetation models, and ultimately gives the opportunity to explore the relationship between shifting tree distributions and climate change over the last 20,000 years (Giesecke et al. 2007, Pearman et al. 2008). The construction of maps of past vegetation, as well as most other research questions utilizing pollen data, requires that the samples can be assigned an accurate age estimate. Studies that need accurate sample-age estimates include the reconstruction of continental-scale trends in climate change (Davis et al. 2003) or the reconstruction of landscape openness through time (Gaillard et al. 2010, Nielsen et al. 2012).

The growing need for calendar time scale chronologies for the pollen data in the EPD is met with the work outlined here. We also present a system indicating the uncertainty on the age estimate for individual samples and discuss constraints and considerations for estimating ages and uncertainties. We describe how the uncertainty information can be used for sample selection. Finally, the chronologies have been used to map the distribution of selected pollen types for the last 15,000 years based on the data in the EPD and taxonomic considerations for this type of analysis are discussed. All work presented here is based on the version of the EPD released on 2nd August 2012.

Dates: building blocks of chronologies

In this paper a date is taken to mean a particular depth in the stratigraphy with some information about the point in time when the material was deposited. This may be an absolute age estimate obtained from the decay of a radioactive isotope (e.g. radiocarbon) or an event that is found in several records and can thus be used for correlation of two or more sequences. The best relative marker horizon is the fallout of a volcanic ash (tephra) that is spread over a large area and is easily identifiable. Clear signals in proxy records that have a sudden regional or continental cause and are thus synchronous can be used as a date. However, where changes in the proxy are used for chronological information these changes become unavailable for analysis as this would lead to circular reasoning (Blaauw 2012). For example the onset of the Holocene is defined as a strong temperature increase of global scale (Walker et al. 2009) that can be identified in many pollen diagrams. However, by using this feature as a date it is not then possible to investigate whether the palynologically-detectable onset of the Holocene was synchronous across Europe. When building chronologies for individual sites, these relative dates have to be expressed as absolute ages and estimates of their uncertainty provided. Physical age measurements such as radiocarbon dates are reported with an uncertainty, but the sample dated may not have been ideal and the resulting age estimate may be erroneous. Additional age information may be included when building chronologies and some radiocarbon ages may be omitted. The resulting mixtures of dates are here referred to as control points. The most common control points for Holocene and Late Glacial pollen diagrams are based on radiocarbon measurements. The coverage of independently dated diagrams is still uneven and it is therefore necessary to include other sources of age estimates until additional diagrams with absolute dates become available. 
We reviewed and evaluated all different control points in the EPD, omitting those that could not be confirmed and adding others such as the onset of the Holocene where the chronologies would otherwise be unconstrained. The labelling of the control points was refined to identify the different characteristics and uncertainties associated with the age estimates (Table 1), and to provide more detailed information for subsequent investigations.

Table 1 List of control points with associated age and uncertainty

Additional site information that can help to constrain the chronologies includes knowledge on basin formation and development, detailed regional vegetation and archaeological information, lithology, and additional proxy records. This full set of constraining chronological information was rarely available when constructing the chronologies presented here. However, we used general knowledge and regional vegetation history, and checked original publications whenever possible to evaluate chronologies and control points.

Control points and their uncertainties

Radiometric dates

Radiocarbon: The most common control points are radiocarbon dates and the version of the EPD used here stores 5,323 such dates. Not all of them are used in the construction of chronologies. Authors usually report all 14C dates from a sequence even if some are deemed inaccurate. Radiocarbon dates can appear to be too old or too young for a variety of reasons. The best known reason is perhaps the uptake of old carbon by aquatic plants or mosses (MacDonald et al. 1991, Björck and Wohlfarth 2001). Hard-water and reservoir effects particularly affect bulk sediment dates and used to be common as conventional radiocarbon dating required large amounts of carbon-containing material (Grimm et al. 2009). The accelerator mass spectrometry (AMS) technique requires much smaller quantities of carbon, making it possible to work with terrestrial macrofossils in many situations. In theory plant fragments, seeds or fruits will be embedded in sediments shortly after their tissue assimilated atmospheric carbon and thus yield the most accurate date, contemporaneous with the sediment deposited. However, using small samples means that a tiny amount of contamination is sufficient to affect a date (Wohlfarth et al. 1998). In a search for sites that were dated with a high degree of confidence, Blois et al. (2011) classified radiocarbon dates based on their likelihood of inaccuracy. We did not apply a classification to the dates but used this type of information when evaluating them based on their stratigraphical context, and excluded spurious dates. For the dates that were retained, we used the reported uncertainty regardless of whether it came from a conventional bulk date or an AMS date from terrestrial macrofossils.

Lead 210: Some cores held in the EPD were dated in their uppermost part using a radioactive isotope of lead (210Pb). This isotope has a half-life of 22.3 years and is thus useful to obtain a time control for the past 150 years. However, unlike radiocarbon its production varies regionally, requiring more complex models to obtain sample ages (Appleby 2001). These age models are often constrained by the abundance of other radioactive isotopes, such as caesium (137Cs), which shows a peak for the time of maximum atomic weapon testing in 1963 and, in many European regions, for the time of the Chernobyl fallout of 1986. We did not evaluate these age estimates and included them in the chronologies with their reported uncertainty.
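As a rough illustration of why 210Pb only provides time control for about the last 150 years, the following Python sketch applies the standard exponential decay law to the half-life quoted above. The function name is hypothetical, and the calculation deliberately ignores the regional supply variations and modelling choices discussed by Appleby (2001).

```python
def fraction_remaining(elapsed_years, half_life_years=22.3):
    """Fraction of the initial 210Pb activity left after elapsed_years."""
    return 0.5 ** (elapsed_years / half_life_years)

# Activity left after one half-life and after 150 years.
after_one_half_life = fraction_remaining(22.3)
after_150 = fraction_remaining(150.0)
```

After 150 years a little under 1 % of the initial activity remains, so older sediments have to rely on other control points.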

Lithological dates

Core tops: Most late Quaternary cores were collected from sites where sediment is still accumulating. In these situations the sediment surface can be assigned to the year of sampling, which adds an important control point for the chronologies. Depending on the coring device it is not always possible to obtain the youngest sample in an undisturbed way, which is a particular problem in lakes. Wherever possible the relevant publications were examined to decide whether the core top is modern and how it was collected. Each core top was classified based on the information available to us at the time and slightly different ages are assigned to the different classed dates (Table 1). In situations where no information was available, the pollen composition of the topmost samples was evaluated and if it suggested modern vegetation composition a category C was assigned with an age of a.d. 1950 (0 b.p.) and an uncertainty of 250 years. Uncertainty in time was used here also to indicate the probability of loss of material as in this case the topmost sample would be slightly older. As future ages are impossible for sediments that were deposited in the recent past we used a half Gaussian error distribution to allow uncertainty in the past, but not in the future.
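The half-Gaussian error for core tops can be sketched as follows. This is a minimal Python illustration and not the code used to build the chronologies: it assumes that the stated uncertainty acts as the standard deviation of the underlying Gaussian and that ages are expressed in cal. b.p., so that larger values are older. The function name and parameters are hypothetical.

```python
import numpy as np

def sample_core_top_ages(age_bp, sd_years, n_draws, seed=0):
    """Draw core-top ages (cal. b.p.) that can only be older than age_bp."""
    rng = np.random.default_rng(seed)
    # The absolute value of a zero-mean normal draw gives a half-normal offset >= 0.
    offsets = np.abs(rng.normal(loc=0.0, scale=sd_years, size=n_draws))
    # Adding the offset moves the age into the past, never into the future.
    return age_bp + offsets

# Category C core top: a.d. 1950 (0 b.p.) with an assumed sd of 250 years.
draws = sample_core_top_ages(age_bp=0.0, sd_years=250.0, n_draws=1000)
```

Because every draw lies at or before the assigned age, the average of the draws is older than the assigned age, which is consistent with the expected loss of the topmost material.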

Annual laminations: Lakes with continuous annually laminated sediments provide the best possible age control and the EPD stores some pollen records from such sites. These datasets have ages assigned to all samples and each sample is therefore a control point. However, varve chronologies are rarely perfect. Slumps can interrupt the continuous sedimentation and differences in varve-counting replication create uncertainties in sample ages (e.g. Brauer et al. 2000). Annually laminated sediments do not always occur all the way to the sediment–water interface, requiring dating of the floating varve chronologies by radiocarbon or linking to tephra layers. We assigned an error of 1 % of the ascribed age unless error estimates were provided with the varve chronology.

Tephra layers: Tephra layers that can be identified and assigned to an age are important control points for many European pollen diagrams. Northern Europe has mainly received volcanic ash from the eruptions of Icelandic volcanoes. Eruptions from various volcanoes in the Mediterranean provide chronological information in this region (Davies et al. 2012). The most common tephra in central European cores is the Laacher See Tephra, which is easily recognisable and has been dated to 12880 ± 120 cal. b.p. (Brauer et al. 1999). Recent developments in tephrochronology enabling the extraction of small amounts of shards (microtephra), will in the future enable the development of a more spatially-extensive tephrostratigraphy across Europe (e.g. Lane et al. 2012).

Biostratigraphic dates

A large proportion of pollen grains deposited at any site are derived from the regional vegetation (Sugita 1993). Therefore two pollen diagrams from adjacent basins will reflect similar changes in regional vegetation composition and if such features are well dated for one diagram their ages can be used as control points in the other. Regional vegetation change itself was often determined or triggered by changes in climate or human land-use which shows some synchrony over large parts of Europe (Giesecke et al. 2011). The onset of the Holocene climate warming caused vegetation shifts globally and Late Glacial fluctuations such as the Younger Dryas can be identified in many regions worldwide. The age and duration of these dramatic climate shifts are being determined with increasing accuracy (Rasmussen et al. 2006) and where the pollen-analytical correspondence is identified they can serve as valuable control points. While Rasmussen et al. (2006) dated the onset of the Holocene in Greenland to 11700 cal. b.p., well-dated lake records in Europe suggest a somewhat younger age for the palynological signal (e.g. Blockley et al. 2008) and we therefore assumed an age of 11500 ± 250 cal. b.p. The onset of the Holocene was used as an additional control point in many diagrams while the onset of the Bølling and the Younger Dryas were less frequently used. The rise of Corylus avellana is broadly parallel in many northern and western European regions (Giesecke et al. 2011) and was used, where necessary, as a control point with a 500-year uncertainty in these regions. The timing of the decline in Ulmus pollen varies somewhat between regions but its regional age was also used as a palynological control point if no independent dating was available. Where authors had used more palynological control points in their data submission to the EPD we tried to assess how the age information was derived and assigned uncertainties or omitted control points that seemed vague. 
Knowledge of the timing of archaeological periods can also provide dating control, where the pollen diagram indicates individual settlement phases. The introduction of cultivated plants such as Juglans may serve as a date, but this only provides a “younger than” date as the pollen sample may post-date the introduction of the cultivar in the region. In some cases historical events, such as the Thirty Years’ War in central Europe (1618–1648), can be identified and provide very accurate and precise time control (Giesecke 2001).

The addition of each palynological control point reduces the degrees of freedom of the overall dataset and palynological control points were therefore only included where the chronologies could not be sufficiently constrained. Some pollen diagrams stored in the EPD are derived from landscapes that changed little over the course of the Holocene or contain features that have not yet been well dated in the region. In these situations, we did not attempt to introduce control points other than the onset of the Holocene and the resulting chronologies remain poorly constrained.

Estimated basal ages

In recently deglaciated regions the timing of glacier retreat is often known with some confidence. Lakes or bogs located within past glacial limits did not usually accumulate sediments prior to glacial retreat. However, particularly in lakes, initial sedimentation was often rapid and slowed down with the establishment of dense vegetation cover. Down-core extrapolations from younger dates often cannot capture this initial rapid sediment accumulation and where constraining knowledge is available this is valuable in establishing the chronologies for the lowermost sections.

Building chronologies and propagating uncertainties

The choice of method

Sediment cores typically have fewer control points than samples and thus there is a need to make inferences about the sample age based on the available age-depth information. The exception is annually laminated sediments. An increasing number of methods and programs are becoming available to construct age-depth relationships and propagate the uncertainties of the age determinations to the samples (Bennett 1994; Heegaard et al. 2005; Bronk Ramsey 2008). Newly available Bayesian approaches (e.g. Blaauw and Christen 2011) are powerful tools and provide more robust uncertainties for the age estimates of the samples. These techniques need much supervision and computing power and, given the number and diversity of sites, we decided against them. A more pragmatic approach was developed working with simple linear interpolation and smoothing-spline regression as implemented in CLAM (Blaauw 2010). Linear interpolation links the individual control points and results in abrupt changes in sedimentation rates that are in most cases unrealistic. In addition, control points causing age-depth reversals have to be eliminated as negative accumulation rates are impossible. Thus linear interpolation is rarely a realistic model of the real sedimentation history, but provides the simplest age model possible, and generally yields useful interpolations (Bennett and Fuller 2002). Smoothing splines provide a more general age-depth relationship, where the curve between two points is also influenced by more distant control points. Age reversals in control points do not necessarily lead to negative accumulation rates and can often be retained, as it is often difficult to decide which date is reversed, in particular where probability distributions overlap. Fitting smoothing splines in CLAM has the additional advantage that the full width of a confidence interval guides the general trend of the resulting age-depth relationships.
Thus control points with a large uncertainty work like a guide rather than a tight constraint. Where control points from absolute dates are mixed with control points derived from the vegetation history, the final age assigned to a vegetation change may differ from the age assumed for this event in a control point.

Implementation

All age-depth relationships were constructed using the R-code CLAM version 1.0.2 (Blaauw 2010; R Core Team 2012), which offers the advantage that radiocarbon dates are internally calibrated. This means that the dates can be stored as uncalibrated dates together with control points expressed as calendar-year ages. After evaluating all control points, they were run as a batch yielding two age-depth relationships per site, based on linear interpolation and smoothing-spline fitting with a default smoothing factor of 0.3. The two alternative models were visually evaluated, selecting the more appropriate one, with a general preference for the smoothing spline. However, smoothing splines presented problems when the density of dates varied markedly within a sequence, and could not be produced for sites with a low number (<4) of control points. Where neither of the two age-depth relationships provided an adequate chronology, the control points were edited and if necessary the smoothing factor changed. Uncertainties were obtained in CLAM through drawing random samples from the probability distributions of each control point and fitting 1,000 age-depth relationships to these points. The resulting uncertainties may be non-symmetrical, especially for the smoothing spline-based age models. The sample age is obtained as the highest-probability age based on the distribution of estimated ages from the 1,000 runs and the uncertainties are provided as 95 % confidence intervals. The half-Gaussian uncertainty distribution assigned to core tops generally results in modelled surface ages being biased toward older ages. As linear interpolations directly connect the assigned ages, half-Gaussian uncertainty distributions were not considered in these cases.
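The resampling idea behind these uncertainties can be sketched in a few lines of Python. This is a simplified illustration, not a reimplementation of CLAM: it assumes that all control points are already in calendar years with Gaussian errors (no radiocarbon calibration), uses linear interpolation only, and reports the median of the runs rather than the highest-probability age. The function name, the example control points and the cap on attempts are hypothetical.

```python
import numpy as np

def monte_carlo_linear_ages(depths, ages, sds, sample_depths, n_iter=1000, seed=0):
    """Median age and 95 % interval per sample depth from resampled control points."""
    rng = np.random.default_rng(seed)
    depths = np.asarray(depths, dtype=float)  # must be strictly increasing
    ages = np.asarray(ages, dtype=float)
    sds = np.asarray(sds, dtype=float)
    runs = []
    for _ in range(20 * n_iter):  # cap the attempts so the loop always ends
        drawn = rng.normal(ages, sds)  # one random age per control point
        # Reject draws with age reversals: negative accumulation is impossible.
        if np.any(np.diff(drawn) <= 0):
            continue
        runs.append(np.interp(sample_depths, depths, drawn))
        if len(runs) == n_iter:
            break
    runs = np.array(runs)
    lower, upper = np.percentile(runs, [2.5, 97.5], axis=0)
    return np.median(runs, axis=0), lower, upper

# Hypothetical core: control-point depths (cm), ages (cal. b.p.) and 1-sigma errors.
best, lower, upper = monte_carlo_linear_ages(
    depths=[0, 100, 250, 400],
    ages=[0, 1500, 4200, 8000],
    sds=[10, 60, 80, 120],
    sample_depths=[50, 175, 325],
)
```

Taking percentiles of the runs rather than assuming a symmetric error allows the resulting intervals to be non-symmetrical, as noted above.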

An additional uncertainty classification

The propagation of the age uncertainty from the control points to the samples is satisfactory where a high density of control points is available. The linear interpolation and smoothing spline methods do this in different ways. For linear interpolation, sample uncertainties decrease between control points. In smoothing splines, the sample uncertainties depend more on the curvature of the age–depth model and are often larger with increasing distance between control points. Where the time interval between control points is large, neither of the two procedures can adequately describe the uncertainty on the ascribed sample age, and will often underestimate the chronological uncertainties (Fig. 1). Therefore we created an uncertainty classification that largely reflects the density of control points to complement the assignment of uncertainty through interpolation and extrapolation. The classification is additive and samples are assigned to the lowest class (a single star) where the estimated sample age is within 2,000 years of the nearest control point. Additional stars are given at 1,000- and 500-year proximity to the nearest control point (Table 2). In addition to the three stars that characterize proximity to the nearest control point, a further star is given to samples that are situated in a straight section of the sequence. The ‘straightness’ star describes the necessity of additional control points to adequately describe complex sequences with large changes in sediment accumulation rates or hiatuses. The ‘straightness’ star is given to a sample where within the nearest four control points the modelled sediment accumulation rate changes less than 20 %. Only sequences with at least four control points can obtain a straightness star. The evaluation is based on the position of the sample relative to the control points and is independent of the interpolation procedure.
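The star rules can be expressed as a small Python sketch. Several details are assumptions made for illustration and are labelled as such in the comments: the proximity thresholds are treated as inclusive, the four nearest control points are chosen by depth, and the 20 % criterion is read as the relative change between accumulation rates of successive segments between control points (rather than rates taken from the fitted model). All function names are hypothetical.

```python
def proximity_stars(sample_age, control_ages):
    """One star each for a control point within 2,000, 1,000 and 500 years."""
    nearest_gap = min(abs(sample_age - age) for age in control_ages)
    # Assumption: thresholds are inclusive.
    return sum(nearest_gap <= limit for limit in (2000, 1000, 500))

def straightness_star(sample_depth, control_points, tolerance=0.20):
    """One star if accumulation changes < 20 % across the four nearest control points."""
    if len(control_points) < 4:
        return 0  # sequences with fewer than four control points cannot qualify
    # Assumption: "nearest" is measured in depth.
    nearest = sorted(control_points, key=lambda p: abs(p[0] - sample_depth))[:4]
    nearest.sort()
    rates = []
    for (d1, a1), (d2, a2) in zip(nearest, nearest[1:]):
        if a2 <= a1 or d2 <= d1:
            return 0  # reversal or duplicate point: no meaningful rate
        rates.append((d2 - d1) / (a2 - a1))
    # Assumption: compare each segment rate with the previous one.
    for r1, r2 in zip(rates, rates[1:]):
        if abs(r2 - r1) / r1 >= tolerance:
            return 0
    return 1

def total_stars(sample_depth, sample_age, control_points):
    control_ages = [age for _, age in control_points]
    return proximity_stars(sample_age, control_ages) + straightness_star(
        sample_depth, control_points
    )

# Hypothetical control points as (depth in cm, age in cal. b.p.).
control = [(0, 0), (100, 1000), (200, 2000), (300, 3000), (400, 6000)]
well_dated = total_stars(150, 1500, control)
poorly_dated = total_stars(350, 4500, control)
```

In this example the sample at 150 cm lies 500 years from the nearest control point in an evenly accumulating section and receives four stars, while the sample at 350 cm is 1,500 years from the nearest control point in a section where the accumulation rate drops sharply and receives one star.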

Fig. 1

Age-depth relationship for Treppelsee, exemplifying the application of the classification based on control-point density and sequence complexity indicated on the left side next to the depth scale. The black curve and surrounding grey area indicate a smoothing-spline age-depth model and its reconstructed 95 % chronological uncertainty bands, respectively. Between c. 7000 and 1000 cal. b.p., calculated 95 % uncertainties appear overly optimistic. The first (red) bar marks samples placed within a segment of the core bracketed by at least four control points within which the sediment accumulation changes less than 20 %. The three following bars (green, dark blue, light blue) reflect the proximity of a sample to the nearest control points. The lowermost control point represents the palynologically determined onset of the Holocene set to 11500 cal. b.p. with an associated standard deviation of 250 years (this can be identified from its obvious Gaussian distribution). The addition of this biostratigraphical control point constrains the extrapolation, which would otherwise yield ages that are too old. The blue polygons represent the calibrated age range as a distribution, where the height of the polygon represents the probability of that age.

Table 2 Classification of sample age uncertainty

The age-depth relationship inferred for the sediment core from Treppelsee (Fig. 1; Giesecke 2001) gives a good example, illustrating how the star classification helps describe uncertainties for sample ages. The core (>25 m long) was dated with six radiocarbon dates, two of which are situated close to each other. The sediment–water interface was obtained using a freeze corer, giving a high confidence in the assignment of the core top to the year of coring. The beginning of the Holocene could be inferred from the pollen diagram and is used as an additional control point with a standard deviation of 250 years. From the control point marking the beginning of the Holocene until a radiocarbon date at 3000 cal. b.p., the reconstructed increase in sedimentation rate stays below 20 %, giving this section an additional star. Samples with inferred ages around 5500 cal. b.p. have no control point within 2,000 years, but their inferred ages are constrained by the general age-depth relationship, giving them a single star for straightness. The sediment accumulation rate increased between 7000 and 3000 cal. b.p., but it is uncertain when exactly this increase started and whether it was gradual or sudden. The confidence limits assigned to the samples through curve fitting do not reflect this uncertainty in this section of the core, but the decreasing number of stars helps to reflect the uncertainty of this change. Early Holocene samples attained 3 or 4 stars owing to the higher density of control points. Here the confidence limits for the sample ages describe the uncertainties derived from the control points in an adequate way. The most recent samples in the profile are well constrained by control points; the addition of a single control point would fulfil the straight-segment constraint.

Limiting chronologies

The most appropriate age-depth model for each sequence was assessed through checking individual pollen diagrams plotted against sample age and spatial interpolation of taxon abundances for different time periods across Europe. These evaluations made it necessary to restrict the analyses to a section of the pollen diagram, which was implemented as a restriction of the chronology. Restrictions were set for mainly two different reasons. First, in some cases it was impossible to constrain the age–depth model towards the bottom or top of the core and the extrapolations yielded unrealistic age estimates. For example, Late Glacial diagrams from areas that were not glaciated or where deglaciation occurred long before the onset of limnic sedimentation could often not be constrained. Extrapolation over several thousand years seemed unrealistic. Restrictions towards the top were less common but were set in cases where the core-top was apparently not modern, its age could not be otherwise estimated and extrapolation yielded inappropriate ages. Second, reworked pollen grains that are out of their correct biostratigraphic context are common towards the bottom of cores and during the Late Glacial, where erosion cut into older deposits. On a site-specific basis these reworked pollen grains are usually easily identified, while their assessment in collections of samples from particular time slices is difficult and may lead to misinterpretations. Although some of these samples may be adequately assigned to the age of their deposition, the pollen composition of these samples does not reflect the vegetation near the sampling site at that time. These cut-off limits were determined as sediment depth and stored in a separate table.
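Applying such depth cut-offs before any analysis could look like the following Python sketch. The table layout is an assumption for illustration only: each site maps to an upper and a lower depth limit, with None marking an unrestricted end, and sites without an entry are left unrestricted. The function and site names are hypothetical.

```python
def apply_depth_limits(samples, limits):
    """Keep (site, depth_cm) samples that fall inside the site's depth limits."""
    kept = []
    for site, depth in samples:
        top, bottom = limits.get(site, (None, None))
        if top is not None and depth < top:
            continue  # above the upper cut-off (e.g. non-modern core top)
        if bottom is not None and depth > bottom:
            continue  # below the lower cut-off (e.g. reworked Late Glacial pollen)
        kept.append((site, depth))
    return kept

samples = [("site_a", 10), ("site_a", 500), ("site_a", 950), ("site_b", 20)]
limits = {"site_a": (None, 900)}  # site_a is restricted towards the bottom only
restricted = apply_depth_limits(samples, limits)
```

Storing the limits as depths rather than ages keeps them valid if the age-depth model of a site is later revised.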

General-purpose chronologies and sample selection

The chronologies that were produced following the outlined rationale and procedure will rarely be the best chronology that may be achievable for any given site. Where pollen data were used to derive control points, the chronologies are dependent on the assumed timing of vegetation change. Nevertheless these chronologies provide more independent age control for the mapping of vegetation change than was possible in the dataset compiled by Huntley and Birks (1983). The chronologies should be adequate for most continental-scale questions and database queries. We consider them as general-purpose chronologies and stress that users should consider whether they are adequate to pursue their own research questions, particularly if these are site-specific.

The new calendar time-scale chronologies are an important step towards the mapping of European vegetation change and related questions. In addition the computation of sample age uncertainties on a database-wide scale provides new opportunities for sample selection and database queries. Individual sites are often well dated for some sections, while only vague age estimates can be provided for other samples. The twofold uncertainty classification provided here allows more flexibility for the inclusion/exclusion of samples for a particular analysis. Where high confidence in age control is necessary, for example in comparisons of vegetation patterns with archaeological periods or to investigate vegetation responses to climate perturbations such as the 8.2 ka event, the investigator can restrict the analysis to samples with three stars and uncertainties within 250 years. Other applications, such as the mapping of late Quaternary vegetation change on a continental scale, do not perhaps require highly accurate ages but need a spatially-extensive set of samples: here one star and a 1,000-year uncertainty envelope may suffice. It has to be remembered that the star classification needs to be used in conjunction with the sample uncertainty derived through line fitting. For example, a diagram from a region with pronounced vegetation changes with regionally known ages may have a satisfactory density of control points and samples may be assigned two or more stars even if no or only few radiocarbon dates are available. However, the large uncertainties of these types of control point may result in large confidence intervals (>1,000 years) for individual samples. In constructing the chronologies we aimed to use as few control points based on vegetation history as possible and the vast majority of control points are derived from radiocarbon dates.
The effect of the onset of the Holocene as a control point is visible in the histograms depicting the number of samples or core segments assigned to 3 and 4 stars (Fig. 2).
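
The combined use of the two criteria can be illustrated with a minimal sketch. The record layout, the field names and the example values are hypothetical and do not reflect the actual EPD table structure; the uncertainty is assumed to be a single value in years per sample.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    site: str
    age_cal_bp: float   # inferred sample age (cal. years BP)
    uncertainty: float  # assumed: age uncertainty in years
    stars: int          # star classification of the sample


def select_samples(samples, min_stars, max_uncertainty):
    """Keep samples that satisfy both the star and the uncertainty criterion."""
    return [s for s in samples
            if s.stars >= min_stars and s.uncertainty <= max_uncertainty]


# Invented example records, for illustration only.
samples = [
    Sample("A", 8200, 120, 4),
    Sample("A", 8450, 600, 3),
    Sample("B", 8100, 900, 1),
    Sample("C", 8300, 1500, 2),
]

# Strict selection, e.g. for questions around the 8.2 ka event.
strict = select_samples(samples, min_stars=3, max_uncertainty=250)

# Broad selection, e.g. for continental-scale mapping.
broad = select_samples(samples, min_stars=1, max_uncertainty=1000)
```

Applying both filters together reflects the point made above: a sample with several stars can still carry a large uncertainty, and is then excluded by the second criterion.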

Fig. 2

Histograms depicting the distribution of samples and the number of cores available in the EPD over the last 25,000 years in 500-year bins. The different colours show the number of samples, or samples in core segments, that were classified with 3 and 4 stars, indicating good control-point density for most samples

Taxonomy: what are we mapping?

Since von Post’s lecture in 1916 pollen identification has advanced substantially, leading to the creation of ever more detailed pollen determination keys, pollen atlases (e.g. Fægri et al. 1989; Moore et al. 1991; Reille 1992, 1995, 1998; Punt et al. 1976–2009; Beug 2004), and pollen reference collections. Thus the detail of taxonomic identification has considerably improved through time. Unfortunately, the nomenclature and definition of pollen types are not consistent. Depending on the determination key in use, different laboratories or schools use different nomenclatures, which makes it difficult for database users to synthesise all relevant information. For example, many genera of the Asteroideae together constitute a single pollen-morphological type, but this type is named diversely, e.g. Solidago-type in Fægri et al. (1989), Aster-type in Moore et al. (1991) and Senecio-type in Beug (2004). Similarly, an analyst working in northern Europe may decide to name a pollen type according to a particular species (e.g. Sanguisorba minor) while a colleague working in southern Europe has to deal with more species or genera that produce the same pollen type (e.g. Sanguisorba minor and Sarcopoterium). Other nomenclatural obstacles arise when authors apply different plant nomenclatures (e.g. Umbelliferae/Apiaceae or Fallopia/Bilderdykia/Reynoutria). Finally, pollen-type identification and hence naming also depend on the pollen preservation at a particular site as well as on the experience of the investigator, which usually grows over time.

The EPD aims to harmonize nomenclatural inconsistencies by implementing a standard catalogue of pollen-type names and introducing a hierarchy. The EPD nomenclature is geographically confined to Europe and plant names follow the Flora Europaea (Tutin et al. 1964–1993). This enables database users to access the full data set of particular taxa regardless of whether different pollen-type names were originally assigned. The EPD nevertheless stores the original nomenclature as supplied by the author. A table (P_VARS) within the database links the author’s pollen-type names to a common EPD nomenclature, while retaining all details carried by the original naming. The name of an EPD pollen type consists of either (1) the name of a plant taxon (e.g. Calluna vulgaris, Fagus, Brassicaceae), (2) two plant-taxon names separated by a slash (e.g. Humulus/Cannabis), or (3) a plant-taxon name with the suffix -type (e.g. Rumex acetosa-type). In most cases the EPD nomenclature follows the naming of recognised determination keys or pollen atlases (e.g. Fægri et al. 1989; Moore et al. 1991; Reille 1992, 1995, 1998; Punt et al. 1976–2009; Beug 2004) with the reference of the type name given. When a new EPD name is created, a full description or explanation is provided.
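
The linking of author names to a common name, and the three name forms, can be sketched as follows. The lookup table is a hypothetical excerpt and not the content of P_VARS; in particular, the choice of Senecio-type as the common name is an assumption made only for illustration.

```python
# Hypothetical excerpt of a P_VARS-like lookup: author's name -> common name.
# Which name the EPD actually uses for this type is NOT taken from the
# database; "Senecio-type" is an assumption for illustration.
AUTHOR_TO_EPD = {
    "Solidago-type": "Senecio-type",
    "Aster-type": "Senecio-type",
    "Senecio-type": "Senecio-type",
}


def harmonise(author_name):
    """Return (original name, common name); unknown names map to themselves."""
    return author_name, AUTHOR_TO_EPD.get(author_name, author_name)


def name_form(name):
    """Classify a pollen-type name into one of the three forms listed above."""
    name = name.strip()
    if name.endswith("-type"):
        return "taxon with suffix -type"
    if "/" in name:
        return "two taxa separated by a slash"
    return "plant taxon"


original, common = harmonise("Aster-type")
forms = [name_form(n) for n in
         ("Calluna vulgaris", "Humulus/Cannabis", "Rumex acetosa-type")]
```

Returning the original name together with the common name mirrors the principle that the author's nomenclature is retained alongside the harmonised one.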

The P_VARS database table is also meant to link each pollen type to a higher-level collective type. For example, the higher-level collective type for Secale, Zea mays, Hordeum-type, Triticum-type and Avena-type is Cerealia-type, which also comprises all ambiguous cereal pollen determinations such as cf. Triticum, Avena/Triticum and Cerealia undiff. Consequently database users can map either single, well-determined pollen types or the sum of all cereal pollen. An example from southern Europe is, experience and preservation permitting, the separation of Quercus into Quercus robur-type, Q. cerris-type and Q. ilex-type. The latter types predominantly include evergreen Quercus species. If circumstances do not allow the recognition of Q. cerris-type or Q. ilex-type, such pollen would be identified only to the genus level Quercus. A map of evergreen Quercus pollen can thus only be a minimum map: the true distribution and abundance of these pollen types are likely to be greater than the mapped area and/or abundance. All distinguished Quercus-types and Quercus together constitute the higher-level collective taxon Quercus. When mapping the higher-level collective taxon Quercus, database users receive a map of all Quercus pollen records including the more precise determinations.
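
As a minimal sketch of how such a hierarchy can be used, the following sums pollen percentages at the level of the collective type. The mapping lists only the examples named above and is not the P_VARS table itself; the sample values are invented.

```python
from collections import defaultdict

# Pollen type -> higher-level collective type (examples from the text only).
HIGHER_LEVEL = {
    "Secale": "Cerealia-type",
    "Zea mays": "Cerealia-type",
    "Hordeum-type": "Cerealia-type",
    "Triticum-type": "Cerealia-type",
    "Avena-type": "Cerealia-type",
    "cf. Triticum": "Cerealia-type",
    "Avena/Triticum": "Cerealia-type",
    "Cerealia undiff.": "Cerealia-type",
    "Quercus robur-type": "Quercus",
    "Quercus cerris-type": "Quercus",
    "Quercus ilex-type": "Quercus",
    "Quercus": "Quercus",
}


def collective_percentages(sample):
    """Sum the percentages of one sample by higher-level collective type.

    Types without an entry in HIGHER_LEVEL are kept under their own name.
    """
    totals = defaultdict(float)
    for name, pct in sample.items():
        totals[HIGHER_LEVEL.get(name, name)] += pct
    return dict(totals)


# Invented sample: pollen type -> percentage of the pollen sum.
sample = {
    "Secale": 0.5,
    "Cerealia undiff.": 0.25,
    "Quercus ilex-type": 2.0,
    "Quercus": 10.0,
    "Fagus": 5.0,
}
totals = collective_percentages(sample)
```

In this example the collective Quercus value includes the Q. ilex-type determination, whereas a map of Q. ilex-type alone would show only the 2.0 % that could be identified to that level, which is why such a map is a minimum map.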

Currently the P_VARS database table provides some taxonomic hierarchy that makes it possible to collect pollen identifications at taxonomically and ecologically meaningful levels. However, this hierarchy is not in all cases adequate or detailed enough and work is in progress to improve these links. For the current mapping of the data stored in the EPD, we considered the taxonomic groupings in the P_VARS table and made adjustments where necessary.

Applying the chronological information to the making of maps

The work outlined here had the overarching goal of producing chronologies that allow the data currently held in the EPD to be mapped as a basis for further spatial analysis. The resulting maps will be presented elsewhere; one example with a particular motivation is shown here to illustrate further considerations in the making of maps based on these chronologies.

There is considerable interest in assessing the intensity of human land-use in Europe during particular archaeological periods, such as the Medieval forest clearance, Migration period, Iron Age, Bronze Age and the Neolithic period. Pollen from Plantago lanceolata is a good indicator for human land-use and its presence and abundance may give some indication of the intensity of land-use (Behre 1981). In central and northern Europe P. lanceolata is the only species producing this characteristic pollen type. In southern Europe other Plantago species (e.g. P. lagopus, P. altissima, P. argentea) produce the same pollen type (van der Knaap and van Leeuwen 1997; Beug 2004). Nevertheless, increases in the abundance of the pollen type have also been shown to be a good anthropogenic indicator in southern Europe and the Near East (e.g. Behre 1990; Tinner et al. 2009). To minimize over-interpretations, we restrict the maps to identifications of P. lanceolata and P. lanceolata-type, excluding other Plantago pollen-types.

Pollen diagrams differ in their sample resolution, with a normal range of between 50 and 250 years. It is therefore better to average samples over a period of time than to analyse a particular point in time. Here we chose a time window of 500 years. To reduce chronological uncertainty in the analysis, the search was restricted to samples that were classified with at least 3 stars and an uncertainty of less than or equal to 250 years. The resulting average pollen percentages in each 500-year time window were plotted on a map (Fig. 3), displaying the abundances using symbol size and colour in a standardised classification scheme for easy visualization. Averaging pollen percentages over a time window leads to higher effective pollen sums and an increased probability of detecting a pollen type. Thus the presence of the pollen type in the 500-year time window needs to be considered in a differentiated way, as a single occurrence in the time window may bear different information from its continuous presence at low abundance in most or all samples in the time window. A threshold of 0.1 % was used here, which represents the occurrence of a single grain in a count of 1,000 pollen grains.
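
The selection and averaging described above can be sketched as follows. The tuple layout, the alignment of windows to multiples of 500 years and the use of "greater than or equal to" for the threshold are assumptions; the sample values are invented.

```python
from collections import defaultdict

WINDOW = 500     # window width in years
THRESHOLD = 0.1  # percent: one grain in a count of 1,000


def window_start(age, width=WINDOW):
    """Lower bound of the window containing `age` (assumed alignment)."""
    return int(age // width) * width


def window_means(samples, min_stars=3, max_uncertainty=250):
    """Mean pollen percentage per (site, window) for well-dated samples.

    samples: iterable of (site, age_cal_bp, uncertainty, stars, percent).
    """
    groups = defaultdict(list)
    for site, age, uncertainty, stars, percent in samples:
        if stars >= min_stars and uncertainty <= max_uncertainty:
            groups[(site, window_start(age))].append(percent)
    return {key: sum(values) / len(values) for key, values in groups.items()}


def is_present(mean_percent, threshold=THRESHOLD):
    """Assumed rule: present if the window mean reaches the threshold."""
    return mean_percent >= threshold


# Invented samples: (site, age, uncertainty, stars, percent).
samples = [
    ("A", 1100, 100, 4, 1.0),
    ("A", 1300, 200, 3, 2.0),
    ("A", 1400, 600, 3, 9.0),    # excluded: uncertainty above 250 years
    ("B", 1250, 150, 3, 0.0),
    ("B", 1450, 150, 4, 0.125),  # a single low occurrence in the window
]
means = window_means(samples)
```

Site B shows why presence has to be treated with care: one low occurrence among otherwise empty samples yields a window mean below the threshold, although the pollen type was recorded.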

Fig. 3

Presence and abundance of Plantago lanceolata-type pollen during five time periods, chosen to represent the Medieval forest clearance, Migration period, Iron Age, Bronze Age and Neolithic period

Concluding remarks

This paper presents chronologies for the majority of sites currently stored in the EPD, which are based on a calendar time scale and include associated assessments of uncertainties for the inferred sample ages. A variety of control points were evaluated for 1,036 sites and the resulting age-depth relationships were constructed based on either linear interpolation or (preferentially) smoothing splines. These line-fitting techniques were used to propagate the error from the uncertainties of the control points to the samples using the R-code CLAM. To complement these sample-age confidences, a classification capturing the dating density and the complexity of the age-depth relationship is provided. It thus becomes possible to select samples based on different requirements for the accuracy of the age control. This provides a general-purpose chronology fit for most continental-scale questions. However, we encourage any user working with a small number of sites or with individual sites to review the individual chronologies and, where necessary, construct new ones. It also needs to be noted that we did not construct chronologies for all sites in the EPD, but worked only with sites where some chronological information had been submitted to the EPD or where previous efforts had been made to establish a chronology.
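
The principle of propagating control-point uncertainty to a sample age through linear interpolation can be illustrated with a simple Monte Carlo sketch. This is not the CLAM code: it assumes normally distributed control-point ages with a given standard deviation instead of calibrated radiocarbon distributions, it does not treat age reversals, and all function names and values are hypothetical.

```python
import random


def interpolate_age(depth, control_points):
    """Linear interpolation between (depth, age) control points.

    Control points are assumed to have distinct depths.
    """
    pts = sorted(control_points)
    for (d0, a0), (d1, a1) in zip(pts, pts[1:]):
        if d0 <= depth <= d1:
            return a0 + (a1 - a0) * (depth - d0) / (d1 - d0)
    raise ValueError("depth outside the range of the control points")


def age_interval(depth, control_points, n=2000, seed=1):
    """Approximate 95 % interval and median of the age at `depth`.

    control_points: list of (depth, age, sd). In each iteration every
    control-point age is drawn from a normal distribution (an assumption)
    and the sample age is interpolated from the drawn ages.
    """
    rng = random.Random(seed)
    ages = []
    for _ in range(n):
        drawn = [(d, rng.gauss(a, sd)) for d, a, sd in control_points]
        ages.append(interpolate_age(depth, drawn))
    ages.sort()
    return ages[int(0.025 * n)], ages[n // 2], ages[int(0.975 * n) - 1]


# Invented control points: (depth in cm, age in cal. years BP, sd in years).
points = [(0, 0, 50), (100, 1000, 50)]
midpoint_age = interpolate_age(50, [(d, a) for d, a, _ in points])
low, median, high = age_interval(50, points)
```

The width of the resulting interval depends on the uncertainties of the neighbouring control points, which is why samples in sections constrained only by poorly dated control points receive wide confidence intervals.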

Supplementary data

The control points used as well as the derived chronologies, uncertainties and classifications are stored in Pangaea (data are available at http://doi.pangaea.de/10.1594/PANGAEA.804597).