An objective method based on assemblages of subfossil plant macro-remains to reconstruct past natural vegetation: a case study at Swifterbant, The Netherlands

We present a new method of identifying past plant communities based on a palaeobotanical dataset. The dataset used as a case study consists of plant macro-remains retrieved from the Neolithic settlement Swifterbant S4, The Netherlands. Taxa were grouped based on their present-day concurrence values. Subsequently, phytosociological analysis was performed on the subfossil taxon groups using the software package PALAEOASSOCIA, adjusted for this type of research. Results show that syntaxonomic knowledge on the concurrence of plant species can be used to reconstruct parts of the past vegetation. We further discuss the theory behind the reconstruction of syntaxa, with special emphasis on actualism.


Introduction
The reconstruction of past vegetation in the vicinity of archaeological sites has always been one of the key goals in archaeobotany, giving insight into the conditions and exploitation possibilities of the area for its former inhabitants. In the present study, a new objective method is introduced for identifying past vegetation through phytosociology, the study of plant communities. For an introduction to phytosociology, see Braun-Blanquet (1964). The method applies to natural vegetation and the samples analyzed here were not, strictly speaking, from an archaeological feature. We will therefore refer to the samples as palaeobotanical instead of archaeobotanical.
In the case study presented in this paper, focus lies on the reconstruction of the regional vegetation around the site for the relatively brief period from 4300 to 4000 cal. B.C., and is based on the analysis of plant macro-remains. The methods presented, however, can also be applied to pollen, wood or, and perhaps preferably, a combination of all data available from the site under study. The methodology presented in this paper shows that reliable vegetation reconstruction based on phytosociology can be achieved, even with palaeobotanical samples representing a mixture of plants from different syntaxa (plant communities defined by phytosociology).
Once palaeobotanical data have been gathered, there are two established approaches for their interpretation towards a reconstruction of past vegetation: the individualistic approach and the assemblage approach, which have both been defined by Birks and Birks (2005, p. 343) and used for climate reconstruction (see below). These methods heavily rely on the uniformitarian assumption, also called actualism. Actualism can only fully be falsified if pure and complete samples are found, providing insight into the composition of a specific past vegetation type. However, pure and complete samples are rare for both macro-remains and pollen samples. Therefore it is necessary to find ways to divide and characterize taxon sets that clearly show a mixture of several vegetation types, as well as to define missing taxa.

Individualistic approach
The individualistic approach is based on information on the environmental optima and tolerances of a particular taxon. Abiotic values can be derived, for example, from Ellenberg et al. (1991) or Runhaar et al. (2004). These individual, taxon-bound values may be used to reconstruct specific abiotic conditions of the environment, like salinity or moisture availability (Behre 1991; Cappers 1995a). By combining different abiotic values, a taxon list can be divided into subsets probably sharing the same habitat. Thus, the individual approach is used as an indirect way to establish an 'assemblage' (see below) as well as an indication of the variability of habitats in the landscape. This approach is suitable, assuming that the response of the taxa to environmental factors did not change and that the combinations of environmental conditions are comparable between the past and nowadays (actualism), so that most probably the composition of vegetation did not change very much over time. A disadvantage of using abiotic values is that these are based on field observations of growth locations, but insight in which factors influence the occurrence of a taxon is lacking (Bogaard 2004, p. 7; Charles et al. 1997, p. 1152). Therefore, Charles et al. (1997) and Bogaard (2004) propose using functional attributes (biotic factors) such as leaf life span and root length to reconstruct vegetation types for which one might assume that a modern analogy of a combination of factors influencing the chances of a taxon occurring is lacking: a prime example is arable weed vegetation. Recent studies on historical changes in synanthropic vegetation (affected by human activities) confirm that changing land use and lifestyle considerably alter such vegetation (Lososova and Simonova 2008).

Assemblage approach
The community and assemblage approach explores the interspecific relationships (plant sociology) of plant taxa occurring together (concurring) at a site. The interspecific relationships of plants can be expressed in two different ways.
The first is by means of ecological grouping of taxa. Ecological taxon groups can be adopted directly from the literature (Arnolds and Van der Maarel 1979; Ellenberg et al. 1991, pp. 71-75; Runhaar et al. 2004, pp. 24-26), by adjusting adopted taxon groups to palaeobotanical datasets (Kreuz 2005, p. 85, after Ellenberg et al. 1991; Out 2012, after Arnolds and Van der Maarel 1979), or they can be constructed manually. Manually means here that the groups are formed by the individual researcher, based on expert knowledge, for example of the taxon's past or current environment. The ordering of the data in ecological taxon groups is particularly useful in archaeological contexts, where the relationships between human impact and ecology are an important research goal.
Ecological taxon groups like 'arable weeds' and 'plants of trampled places' may be better suited to archaeological interpretations than possibly related synanthropic vegetation units like the syntaxa Veronico-Lamietum hybridi or Plantagini-Lolietum perennis. In contrast to syntaxonomy, where concurrence is based on many actual vegetation descriptions of taxa occurring together, ecological groups have been artificially created by combining plant taxa and environmental characteristics. Concurrence of the taxa in these groups need not have been actually witnessed in a real-life situation (Arnolds and Van der Maarel 1979, p. 305).
The second way to organise taxa is by phytosociology. This approach aims at identifying established plant communities which resulted in the palaeobotanical dataset under study. These plant communities have been empirically defined by mapping present-day vegetation in the field. There are several methods for the identification of syntaxonomical units manually (Van Geel et al. 2003; Van Zeist and Palfenier-Vegter 1981). Successful attempts to reconstruct past vegetation by modern analogues are presented in classic studies by Overpeck et al. (1985) and Körber-Grohne (1992).
The present study explores the possibility of treating a palaeobotanical sample as a sample of modern vegetation (relevée, or plot), enabling comparisons to the dataset comprising all Dutch relevées to reconstruct former syntaxonomic units (plant communities). By this means, we have an objective way of classifying the past vegetation, supported by a huge amount of comparative data. A comprehensive description of this methodology is presented below. The validity of using present-day syntaxonomy for the reconstruction of past vegetation is further explored in the discussion.
This case study is carried out on drift litter collected in the vicinity of the Neolithic site Swifterbant S4, dated from 4300 to 4000 cal. B.C. (Figs. 1, 2). There are three major advantages of using drift litter: (1) a high taxon number can be found within a small sample, which is time-efficient, (2) drift litter is less likely to have been disturbed by direct human activities than samples taken from settlement layers, and (3) most taxa found in drift litter are likely to be of regional origin, thus giving a good indication of the surrounding vegetation (Wolters and Bakker 2002, Table 4.5).

Swifterbant site description and sampling
The Swifterbant Culture consisted of Late Mesolithic hunter-gatherers (c. 5000-4700 cal. B.C.) and Neolithic hunter-gatherer-farmers (c. 4700-3400 cal. B.C.) in the central Netherlands. Overviews of this culture have been published by Raemaekers (1999) and Louwe Kooijmans (2005). The past environment of the Swifterbant region is traditionally characterized as an area of wetland creek systems (Fig. 1). This characterization is partly based on a study by Van Zeist and Palfenier-Vegter (1981), who published a vegetation reconstruction of the Swifterbant area using a phytosociological approach on plant macroremains from soil samples of settlement layers of the inhabited levee site S3 (Fig. 2).
The present study is based on palaeobotanical samples close to the levee site Swifterbant S4, located on the bank of a creek some 30 m northeast from the centre of site S3 (50°36 00 N 5°34 0 48 00 E; Fig. 2). In 2004, a new excavation was carried out here in order to gain a better understanding of the landscape, concentrating on the levee's shoreline and the creek fill, rather than on the levee itself. During the excavation, an accumulation of drift litter was discovered from which three samples were taken for palaeobotanical analysis (Fig. 2). Because of the presence of small amounts of archaeological material in the drift litter, it is dated in the period of occupation of the settlements on the levees, between 4300 and 4000 cal. B.C.
A single sample (sample A) originated from a broad accumulation of drift litter at 6.7 m NAP (Dutch Ordnance Datum) and two samples (B and C) originated from a single narrow band of drift litter at -6.0 m NAP on the same creek bank. An increasing rise of mean high water levels caused the whole creek system to be covered with clay sediments directly after the period of habitation (Ente 1976; Van de Plassche et al. 2005), preventing post-habitational contamination.
The total sample volumes were 11.5, 16 and 26 l for the samples A, B and C, respectively. Samples were wet-sieved using various mesh sizes (4.0, 2.0, 1.0, 0.5 and 0.2 mm) and various volumes of the residues were checked for plant macro-remains with a stereomicroscope. Whereas the 4.0, 2.0 and 1.0 mm fractions were studied completely, smaller representative subsamples have been examined of the 0.5 mm (ca. 25 %) and 0.2 mm (<10 %) fractions, until no new taxa were found within a reasonable time. Identification was carried out using the reference collection of the Groningen Institute of Archaeology and the Digital Seed Atlas of The Netherlands (Cappers et al. 2006). Nomenclature for taxa follows Van der Meijden (1996); for syntaxa it follows Schaminée et al. (1995a, b, 1996, 1998) and Stortelder et al. (1999).

Primary data analysis
This analysis aims to objectively identify plant communities that grew near the archaeological site. The only taxon omitted from the macrofossil dataset is Triticum turgidum ssp. dicoccon (emmer wheat), for it is absent from the reference set of present-day Dutch vegetation.
Our three samples were analyzed separately. We only used the presence of taxa in the samples, because many factors influence the quantitative relationship between standing vegetation and density of plant macrofossils. The samples were treated as if they were relevées in the process of the identification analysis. Since we used presence/absence data of the taxa in the three archaeological samples, they are, however, not strictly relevées as these should include relative abundances of taxa. TURBOVEG, a software package for the storage and analysis of relevées (Hennekens and Schaminée 2001), does provide the option of importing relevées on a presence/absence basis, which would then more correctly have to be named 'taxa lists'. These taxa lists were then exported as a Cornell Condensed file (cc!) to be able to import the data into the analytic software package ASSOCIA (Van Tongeren et al. 2008). For our purpose, an extra routine was developed to estimate associations between taxa in modern vegetation, the modified version being called PALAEOASSOCIA. Taxa are considered to be associated if their concurrence is larger than estimated from their separate frequencies under the assumption of independence. For each pair of taxa (A and B) a contingency table was computed from the synoptic syntaxonomic tables (computed from Schaminée et al. 1995a, b, 1996, 1998 and Stortelder et al. 1999) available in PALAEOASSOCIA. Only those syntaxa were considered in which at least one of the subfossil taxa of the sample was present and, because there is no prior knowledge about the area occupied by the syntaxa, all syntaxa were given the same weight. The formulae that calculate the estimates for the probabilities of finding each of the possible four combinations of two taxa are given in Table 1.
Under the assumption that the taxa are independent, expected probabilities were computed with the formula in Table 2. If the probability p(A and B) estimated with the formulae in Table 1 is larger than the value computed in the Table 2 formula, the taxa are associated; if this value is smaller, the taxa exclude each other. The logarithm of the ratio between the estimated p(A and B) and the expected p(A and B) under independence is a symmetric index of association, which is positive for associated taxa, negative for taxa that exclude each other and ca. zero for taxa that are independent. In a spreadsheet, taxon groups were constructed in the taxon-by-taxon association matrix by manually reordering the rows and columns to obtain highly positive values along one diagonal and negative values far away from this diagonal. The manual reordering of taxa was made easier by applying conditional formatting (Table 3). The taxon groups were made as extensive as possible to increase the chances of reliably assigning each group to a syntaxon.
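The association index described above can be illustrated with a minimal Python sketch. The exact formulae of Tables 1 and 2 are not reproduced in this text, so the estimators used here are assumptions: each taxon is represented by its relative frequency in every syntaxon of a synoptic table, all syntaxa are weighted equally, and the joint probability is estimated as the mean product of the two frequencies. The function name `association_index` is hypothetical and is not part of PALAEOASSOCIA.

```python
import math


def association_index(freq_a, freq_b):
    """Log ratio of estimated to expected joint probability for two taxa.

    freq_a, freq_b: relative frequencies (0..1) of taxon A and taxon B in
    the same ordered list of syntaxa. Every syntaxon gets equal weight.
    The estimators below are assumptions, not the formulae of Tables 1-2.
    """
    if len(freq_a) != len(freq_b) or not freq_a:
        raise ValueError("frequency lists must be non-empty and equally long")
    n = len(freq_a)
    p_a = sum(freq_a) / n  # estimated p(A)
    p_b = sum(freq_b) / n  # estimated p(B)
    # Assumed estimate of p(A and B): mean product of the frequencies.
    observed = sum(a * b for a, b in zip(freq_a, freq_b)) / n
    expected = p_a * p_b  # p(A and B) if the taxa were independent
    if observed == 0:
        return float("-inf")  # the taxa never occur in the same syntaxon
    return math.log(observed / expected)
```

Under these assumptions the index is symmetric in A and B, positive when the two taxa are concentrated in the same syntaxa, negative when they avoid each other, and zero when their frequencies are unrelated.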
The taxon groups were once again imported into TURBOVEG, exported as a Cornell Condensed file, and objectively labelled according to their association with a syntaxon using PALAEOASSOCIA. Because the subfossil taxon lists are incomplete, we modified the ASSOCIA routines so that the list of possible syntaxa was based on the weirdness index only (Van Tongeren et al. 2008). The weirdness index is calculated as the sum of all contributions to -2 ln(likelihood) for the taxa present in the sample. If a taxon is present in a syntaxon, the contribution to the weirdness is low. In the original ASSOCIA package, the degree of association of relevées to syntaxa is also based on the incompleteness index. This is the opposite of the weirdness index, calculated from the sum of all contributions to -2 ln(likelihood) for the taxa absent from the sample but present in the association it is compared to. Since all palaeobotanical datasets have missing taxa (Küster 1991, p. 18), the incompleteness index is not applicable to palaeobotanical datasets.
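As a rough illustration of the weirdness index, the sketch below sums -2 ln(p) over the taxa present in a sample, where p is the relative frequency of the taxon in the syntaxon under comparison. This is not the ASSOCIA implementation: the function name `weirdness`, the dictionary layout, and the `floor` value that replaces zero frequencies are all assumptions made for this example.

```python
import math


def weirdness(sample_taxa, syntaxon_freq, floor=0.001):
    """Sum of -2 ln(p) over the taxa present in the sample.

    sample_taxa: iterable of taxon names found in the subfossil group.
    syntaxon_freq: dict mapping taxon name to its relative frequency (0..1)
        in the relevées of one syntaxon.
    floor: assumed minimum probability, used for taxa that are rare in or
        absent from the syntaxon so that the logarithm stays finite.
    """
    total = 0.0
    for taxon in sample_taxa:
        p = max(syntaxon_freq.get(taxon, 0.0), floor)
        total += -2.0 * math.log(p)  # common taxa add little, rare taxa a lot
    return total
```

A taxon found in every relevée of the syntaxon contributes nothing, whereas a taxon that is unknown from the syntaxon contributes the maximum set by `floor`. Taxa absent from the sample are ignored, which mirrors the decision to drop the incompleteness index.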

Reduction of possible vegetation types
The obtained list of possible vegetation types was further constrained by three factors. First, a threshold was set for each taxon group, based on how much a suggested vegetation type may differ from the type with the lowest weirdness value (first suggestion). The threshold was calculated by adding the squared number of taxa in the group divided by 20 to the weirdness value of the first syntaxon suggested. The divisor of 20 is arbitrary, based on the observation that few groups include more than 20 taxa. This threshold was lower for taxon-poor groups, which tend to produce longer lists of syntaxa. All syntaxa for which the weirdness value exceeded the threshold were rejected.
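The threshold rule can be written out as a short Python sketch. It reads "the squared number of taxa in the group divided by 20" as n²/20, which is an interpretation of the wording rather than a confirmed formula; the function name `filter_syntaxa` and the weirdness values in the usage note are invented for illustration.

```python
def filter_syntaxa(weirdness_by_syntaxon, n_taxa, divisor=20):
    """Keep syntaxa whose weirdness does not exceed the group threshold.

    weirdness_by_syntaxon: dict mapping syntaxon code to weirdness value.
    n_taxa: number of taxa in the subfossil taxon group.
    divisor: the arbitrary constant (20 in the text).
    Returns (threshold, dict of accepted syntaxa).
    """
    if not weirdness_by_syntaxon:
        raise ValueError("at least one suggested syntaxon is required")
    best = min(weirdness_by_syntaxon.values())  # first suggestion
    threshold = best + n_taxa ** 2 / divisor    # assumed reading: n^2 / 20
    kept = {code: w for code, w in weirdness_by_syntaxon.items()
            if w <= threshold}
    return threshold, kept
```

For a group of 10 taxa and made-up weirdness values of 30, 34 and 40, the threshold becomes 30 + 100/20 = 35, so the first two suggestions are retained and the third is rejected.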
Second, we chose not to accept basal or derivative communities, because these in particular are greatly influenced by human actions (Kopecký and Hejný 1974). Since human influence nowadays differs greatly from that in prehistoric situations, these communities cannot be compared.
Third, the suggested vegetation types were studied in more detail through the PALAEOASSOCIA diagnosis file. This file shows to what extent the taxa in a group fit a suggested syntaxon, and which taxa are normally present in that syntaxon but were missing here. Taxa listed in over 95 % of present-day relevées of a syntaxon but absent from the subfossil taxon group were reconsidered more closely. The probability of not finding these taxa in palaeobotanical analyses was roughly estimated by comparing the frequency of the reported recordings of such a taxon in the Groningen reference database as well as the Dutch database of palaeobotanical plant macro-remains RADAR (version 2006, listing 6,546 samples with 131,879 records of 3,552 taxa; for introduction in RADAR, see Van Haaster and Brinkkemper 1995). If an absent taxon is known to be rarely, if ever, found in archaeological samples due to unlikelihood of preservation, it might be considered to have occurred in the past landscape as a plausible addition to the taxa found in our samples. However, if a taxon is often found in palaeobotanical samples, its absence in our study is more significant and therefore we consider the probability that such a vegetation type had been present in our study area to be low.
On the other hand, if a taxon is present in less than 5 % of present-day relevées of a syntaxon but was found in the subfossil sample, the suggested vegetation type was considered unlikely to have been present and therefore excluded. Taxa only identified at the genus level were ignored in this step of the analysis, as the PALAEOASSOCIA program considers every taxon a separate 'entity'. It recognizes no taxonomic relationship between species within the same genus and between a species and the genus it belongs to. All taxa identified up to the genus level in the archaeological sample occur in most present-day relevées only at the species level and will therefore always appear as 'weird'.
This resulted in a limited list of syntaxa for each group of taxa within each subfossil sample. The combined list of syntaxa for the three subfossil samples can be used for the historic vegetation reconstruction of the vicinity of the study site. This reconstruction will be presented in another paper, by combining the syntaxonomic information with geographic information on the landscape.

Data selection
The taxon lists of the three subfossil samples (n = 47, 37 and 35; ESM Appendix A) show a high variation in habitat types, ranging from half-moist to aquatic, and both fresh and salt water. Samples A, B, and C were split into 9, 13, and 11 overlapping groups, respectively (ESM Appendices B-D). By making overlapping rather than exclusive taxon groups, we avoid restricting subfossil taxa to only one specific community (Küster 1991, p. 19). The groups were identified using PALAEOASSOCIA. Syntaxon codes (such as 29Aa2b) are built up hierarchically: the first two positions indicate the class, the following capital letter indicates the order, the following letter the alliance and the last number the association. An occasionally present last letter indicates a subassociation. This means that the syntaxon chosen as an example here is in class 29 (Bidentetea tripartitae), order A (Bidentetalia tripartitae), alliance a (Bidention tripartitae), association 2 (Rumicetum maritimi) and subassociation b (R. maritimi chenopodietosum). Because of this strict hierarchy, the syntaxonomical system can be defined as the plant sociology counterpart of taxonomy.
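The hierarchical build-up of the syntaxon codes lends itself to mechanical parsing. The following Python sketch splits an association-level code into its components; the function name `parse_syntaxon_code` is hypothetical, and the pattern is an assumption covering only codes of the form shown in the text (two digits, a capital letter, a lower-case letter, a number and an optional final letter).

```python
import re

# Assumed pattern: class (2 digits), order (capital), alliance (lower case),
# association (number), optional subassociation (lower case).
_CODE = re.compile(r"(\d{2})([A-Z])([a-z])(\d+)([a-z])?")


def parse_syntaxon_code(code):
    """Split a code such as '29Aa2b' into its hierarchical levels."""
    match = _CODE.fullmatch(code)
    if match is None:
        raise ValueError(f"not an association-level syntaxon code: {code!r}")
    cls, order, alliance, association, subassociation = match.groups()
    return {
        "class": cls,
        "order": order,
        "alliance": alliance,
        "association": association,
        "subassociation": subassociation,  # None when no final letter
    }
```

Applied to the example from the text, `parse_syntaxon_code("29Aa2b")` yields class 29, order A, alliance a, association 2 and subassociation b. Codes of any other form are rejected with a `ValueError`.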
The number of suggested syntaxa is negatively related to the number of taxa in a taxon group: the more taxa in a group, the lower the number of suggested syntaxa. The number of suggested syntaxa ranges from 2 to 28. ESM Appendix E shows the syntaxa initially suggested for the taxon groups in sample A. First, all syntaxa were excluded that exceeded the threshold difference with the most likely syntaxon; Table 4 shows the thresholds for sample A. Subsequently all basal and derivative communities were excluded. The number of remaining syntaxa ranges from 1 to 8. The reduced lists for the three samples can be divided into three networks: wet communities, pioneer communities and woodland communities. These groups are shown for sample A in Figs. 3, 4, 5.
The remaining syntaxa were studied in more detail using the PALAEOASSOCIA extended diagnosis file, which led to further exclusion of syntaxa as shown in Table 5. A complete description of the decisions leading to this further reduction would stretch too far; a few examples will be discussed here.
In taxon group 8 of sample B, syntaxon 31Ab1b (Urtico-Malvetum typicum) is among the suggested vegetation types. Urtica urens is present in over 95 % of the modern relevées of this type. As this species is well recognized and often found in palaeobotanical samples, its absence here makes the former presence of this syntaxon highly unlikely. In the same line of reasoning, vegetation type 37Ac5 (Orchio-Cornetum) could be excluded as a possibility for group 11 of sample C. This vegetation type should have contained Cornus sanguinea which is an easily identifiable and frequently found species.
For the suggested vegetation type 31Ab2c in group 7 of sample B however, we acknowledge the possibility that the absence of Hordeum murinum may be related to its palaeobotanical invisibility, rather than to factual absence. In Dutch research, no finds of this species prior to 800 B.C. have been recorded. The former presence of this syntaxon could therefore not be excluded, so H. murinum is considered a 'suggested species'. The comparison with present-day plant communities also provides the possibility of suggesting the presence of species not found in the samples. Another good example of a 'suggested species' is Limosella aquatica. In vegetation type 29Aa4, suggested for groups 2-4 in sample A (Fig. 3) and group 3 in sample C, this species is present nowadays. Though it was not found in the drift litter samples, it has been identified previously in the Netherlands in low numbers in palaeobotanical samples. Additionally, some taxa are considered 'weird species' in most of the cases. For example, Hordeum vulgare (barley) and Malus sylvestris (crab apple) are considered weird, though their occurrence is not impossible in some of the suggested vegetation types. Human activity in the vicinity of these sites may very well have played a role for these useful plants.
The vegetation types suggested for the three subfossil samples are summarized in Table 6. We emphasize that many are very closely related and likely to be found within a relatively short distance of each other in tidal landscapes.
They may even occur along a gradient or in a succession. This is supported by the observation that the association matrix shows substantial overlap in the taxon groups, suggesting a limes divergens (Westhoff and Van der Maarel 1978, pp. 303-305).

Discussion
By treating three palaeobotanical samples from a drift litter accumulation near a Neolithic settlement as present-day vegetation recordings on a presence/absence-level, we were able to compare them with an extensive set of modern Dutch phytosociological data. Using these data, we split the subfossil samples, which are clearly a mixture of several vegetation types, into a number of groups of taxa likely to concur. Concurrence networks have recently been used by Araújo et al. (2011) in studies on climate. For the taxon groups that we created, we identified the most similar plant association(s) described by Schaminée et al. (1995a, b, 1996, 1998) and Stortelder et al. (1999) via the analytic software package ASSOCIA (Van Tongeren et al. 2008). The results of the three samples are consistent with one another and fit in well with existing knowledge on the geological and hydrological conditions of the prehistoric region.
Fig. 4 The taxon groups dominantly representing pioneer communities from sample A

Data analysis
We chose to use presence/absence because of the large discrepancy between the relative abundance of plant macro-remains compared to the abundance of plant species in present-day recordings. Direct translations of seed counts into relative plant abundance as performed by Körber-Grohne (1979) are hampered by both archaeological and ecological problems (Bekker et al. 2000; Van Zeist and Palfenier-Vegter 1981, pp. 133-134).
For instance, seed production has a high interspecific variability, but is also influenced at the intraspecific level by such factors as differences in reproductive allocation and effort (Bazzaz et al. 1992), pollination failure (Fenner 1985) and pre-dispersal seed predation (Crawley 1992). Also, seed dispersal potential has a high interspecific variability, resulting in patterns that deviate quantitatively from the standing vegetation. Seed dispersal potential is also a possible cause for qualitative dissimilarities between seed bank and vegetation: dispersal may result in the loss of taxa from the seed bank, whereas it may also result in the presence of taxa in the seed bank that are not members of the standing vegetation. Thompson and Grime (1979) showed a lack of general correspondence between taxon composition of the seed bank and the associated vegetation. More recently, Bekker et al. (2000) found Czekanowski similarity indices (quantitative Sörensen index) of 40-60 % between quantitative soil seed bank and vegetation data from dry and wet semi-natural grasslands, indicating that a high deviation of seed bank and vegetation composition is not uncommon. These deviations at both the quantitative and qualitative level hamper fine-scale vegetation reconstruction (Cappers 1995a). An additional problem is that accumulated drift litter is not a seed bank. Drift litters along big rivers and on seashores are especially likely to contain plant remains originating from vast areas both in space and time (Cappers 1993). A quantitative translation of our seed numbers into standing vegetation would require extensive studies on the relationship between species composition in an area and the accumulation of the remains in drift litters (Moore 1986, p. 545). Such studies have been performed by Holyoak (1984) and Wolters and Bakker (2002). 
However, to make such a study applicable to our dataset, it would have to be carried out in an area comparable to the area under study, which would inevitably require some circular reasoning.
Furthermore, a direct translation of the seed numbers of our mixed assemblage into standing vegetation would neglect the fact that some taxa may concur in more than one of the suggested vegetation types, but in different ratios. Our methodology makes it possible to first identify a particular plant community, and then use other knowledge of the local landscape (like geomorphology and soil characteristics) to estimate the location and relative abundance of that community in the region.
The methodology used to divide the association matrix into overlapping taxon groups is time consuming. An alternative and much faster way would be to cluster the taxa in the association matrix, treating it as a similarity matrix. However, the hierarchical level at which the clustering should be defined and the problem of ordering the clusters in a nonhierarchical network would still have to be solved.
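To make the clustering alternative concrete, the Python sketch below groups taxa by single linkage: two taxa end up in the same cluster when a chain of pairwise association values above a chosen cut-off connects them. This is only one of many possible clustering schemes and was not used in the present study; the function name `cluster_taxa`, the input layout and the cut-off values are assumptions.

```python
def cluster_taxa(taxa, assoc, cutoff):
    """Single-linkage grouping of taxa from pairwise association values.

    taxa: list of taxon names.
    assoc: dict mapping (taxon_a, taxon_b) pairs to an association index.
    cutoff: pairs with an index above this value are linked.
    Returns a sorted list of sorted, non-overlapping taxon groups.
    """
    parent = {t: t for t in taxa}

    def find(t):
        # Follow parent links to the cluster representative.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for (a, b), value in assoc.items():
        if value > cutoff:
            parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for t in taxa:
        groups.setdefault(find(t), []).append(t)
    return sorted(sorted(g) for g in groups.values())
```

The sketch also shows the two open problems named above. The resulting groups depend entirely on the cut-off, and each taxon is assigned to exactly one group, so the overlapping groups obtained by manual reordering cannot be reproduced this way.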

Actualism
There are two ways in which a plant community that occurred in the region in the past may not have been identified. First, too few taxa of the plant community may have been found as fossils. Second, plant species concurrence may have changed since prehistoric times, resulting in non-analogue plant communities. The hypothetical presence of an unrecognized community that does have a present-day analogue should be considered a form of a false negative (Jackson and Williams 2004). Although they defined this term for a whole dataset of pollen (seemingly) lacking a modern analogue, it also applies to the current study's methodology. Our analysis is based on actualism, which applies to both the individualistic and the assemblage approach (Birks and Birks 2005, p. 343). Actualism assumes that characteristics of species and/or interspecific concurrence did not change over time. However, differences in plant sociology may actually occur when either taxa evolve or when abiotic conditions change into previously absent conditions. The likeliness of this assumption being valid decreases as the distance in time increases (Behre and Jacomet 1991, p. 83; Gee and Giller 1991).
Table 6 Summary of the suggested syntaxa for all samples. The colour scheme follows the field guide to Dutch plant communities (Schaminée et al. 2010). The numbers indicate how often the syntaxa have been suggested (see Table 5). Blue open water and marshes, light green grasslands and heathland, orange coastal and inland pioneer communities, dark green rough thickets and woodlands
The ecological preferences and tolerances of plants are likely to have evolved only marginally in the time span the present paper is dealing with (Behre and Jacomet 1991; Cappers 1995b; Willemsen et al. 1996).
A combination of abiotic conditions in the past lacking a modern-day analogue can occur naturally or because of changes in human activity. In the earlier Holocene, natural conditions may have caused non-analogue habitats (Caseldine and Pardoe 1994; Gee and Giller 1991; Kalis et al. 2006; Overpeck et al. 1985). Several scholars suggest, however, that climatic conditions were roughly stable during the Holocene, especially during its second half (Oldfield 2005). In the period under study here (c. 4300-4000 cal. B.C.), this should therefore not be problematic. The uniformity of climatic conditions is supported by the observation that no taxa currently absent from the Dutch flora were found. Slight alterations in the characteristics of species play a smaller role on the community level, due to the smoothing-out of these differences as the number of taxa increases.
Ecological taxon groups or plant communities which were influenced by human activities may have changed considerably over time. This applies especially to arable weed floras due to changes in farming practices (Hillman 1991; Marshall and Hopkins 1990; Willerding 1979) and to cultivation in ecosystems that are not used for that purpose now. For example, cultivation of salt-marsh areas in the past may have resulted in the inclusion of halophytes in weed associations (Van Zeist 1974, p. 343). Therefore, syntaxonomy is less suitable for studies on plant husbandry than for vegetation reconstruction (Bogaard 2004, pp. 5-6).
Following Van Zeist's (1974, p. 343) line of reasoning that arable weed assemblages will at least partly be a subset of the locally present 'wild' vegetation, an area-specific weed assemblage for this region may be expected to be a subset of the plants in taxon groups assigned to pioneer communities. The association matrices show that H. vulgare (barley) need not be a weird species within this dataset, but caution needs to be taken with cultivated plants because of the different ecological tolerances of present-day cultivars. Nevertheless, there is ongoing debate whether cereal cultivation took place here locally or not (Cappers and Raemaekers 2009; Out 2008, 2009; Weijdema et al. 2011). It is beyond the scope of this paper to join this debate.

Wider geographical applicability
The methodology presented in this paper is useful for palaeobotanical studies in two ways. In the first stage, it provides an objective method to subdivide a plant list that clearly represents a mixture of vegetation types into sets of taxa that might have grown together in various vegetation types. The created taxon groups will overlap, which is to be expected in a natural landscape with many plant communities. Secondly, these groups can be identified phytosociologically. The methodology can be applied in every region for which synoptic tables of vegetation types are available, preferably containing all taxa retrieved in the palaeobotanical sample(s). For the non-palaeo version of the ASSOCIA package, available as a built-in identification tool in TURBOVEG for Dutch vegetation, a dataset of Czech grasslands has already been used to test its applicability in non-Dutch regions (Van Tongeren et al. 2008).
The synoptic tables to be used as a reference set should preferably originate from a region as nearby as possible to the region under study. This also applies to studies using Ellenberg et al. (1991) indicator values or other individual characteristics of taxa. Studies on the wider geographic applicability of Ellenberg indicator values have confirmed that the values originally defined for central Europe can also be applied to western and northern Europe (Godefroid and Dana 2007, and references therein), which is an indication that the Dutch reference set can be used in parts of neighbouring countries where no species are present which are absent from The Netherlands.
In regions that differ botanically from The Netherlands, other plant associations will have been defined, although both German and British overviews of plant communities contain most of our identified syntaxa (or synonyms) at least up to the alliance level (Pott 1992 for Germany; Rodwell 1998a, b, c, d, 2000 for Great Britain).

Concluding remarks
The analysis presented in this paper made it possible to reconstruct past vegetation consisting of the main components of wet, pioneer, and woodland syntaxa. Our new method will make it possible to gain insight in hydrology, geomorphology and soil characteristics in regions where they have not been so well preserved as at Swifterbant. The syntaxa within the Bidention tripartitae alliance, occurring on periodically flooded, fresh to brackish clay or clayey peat along creeks and ditches, seem to fit in well with the geological knowledge of the region (Schaminée et al. 2010, pp. 302-305). A further analysis of the results is in preparation, including the position of the syntaxa in the landscape and the implications of the vegetation reconstruction for the use of plant resources by humans. Although an exact match to the prehistoric situation can never be claimed, the use of factual plant communities as an analogue opens the way to use parameters of these communities such as biomass production and nutritional value, and also to create realistic reconstructions of the past landscape. This is of great value for the presentation of palaeobotanical information to archaeologists and to the general public.