1 Introduction

In recent years, we have witnessed large increases in global temperature (for example + 0.61 °C/century between 1850 and 2007 for northern hemispheric land air temperatures (Halley 2009)), and under the most conservative scenarios the increase should reach + 1.5 °C by 2050 (IPCC 2007). To cope with changes in temperature, organisms have to either adapt or migrate (Black et al. 2011). However, it is not only the vagile organisms (e.g. Lyons 2003; Thomas and Lennon 1999) that migrate to cope with a changing environment. Case studies are accruing to suggest that, in response to climate change, plants either move upslope (Felde et al. 2012; Lenoir et al. 2008; Morueta-Holme et al. 2015) or, in the northern hemisphere, shift their distribution ranges northwards (Jump et al. 2009; Williams et al. 2004). Yet, we do not know to what degree woody plants can keep pace with the changing environment (Jump et al. 2009; Svenning and Skov 2007; but see Williams et al. 2002). Of course, there are apparent ecological mechanisms that allow plants to persist even when outpaced by climate change, such as temporary removal of enemies (Bakker et al. 2016), long relaxation times before local extinction (Halley et al. 2016; Halley et al. 2017; Jackson and Sax 2010), and the ability to maintain refuge populations outside their distribution range (Pearson 2006). Despite these mechanisms, we are aware of tree species that went extinct in response to climate change during the last deglaciation (Jackson and Weng 1999). It is unlikely that all woody plants face an equal threat of extinction from climate change. Although existing models use plant traits, such as seed dispersal, longevity, and tree height, which could affect how trees cope with climate change (Morin and Thuiller 2009; Nathan et al. 2011), these are not always supported by existing empirical studies (Bertrand et al. 2016; Lenoir et al. 2008; Schwartz et al. 2006).

Much of our understanding of how the distributions of woody species have changed in response to climate originates from fossil data or historic records on tree distributions (Morueta-Holme et al. 2015; Williams et al. 2004). McLachlan and Clark (2004) argued that such fossil data, at least in the case study of North America, could have been somewhat inaccurate. An alternative way to study migration is through a focus on existing distributions of plant species to infer migration rates (e.g. Lenoir et al. 2008; Lenoir et al. 2009; Monleon and Lintz 2015; Woodall et al. 2009; Zhu et al. 2012). We know that global warming improves conditions at the northern edge and invites expansion to the north generating immigration credit (Jackson and Sax 2010) for warm-adapted species. However, at the southern edge, via the same mechanism, other competing species invade the range, generating supersaturation and extinction debt (Halley et al. 2016; Halley et al. 2017). There is no reason that these two processes proceed at the same rate. As a result, we find a higher proportion of older trees in the southern rather than the northern extreme of their ranges, which we can use as a diagnostic tool to assess migration (Lenoir et al. 2009; Woodall et al. 2009). Older trees are often bigger than younger trees (Niklas et al. 2003) and if we average many observations, we can use the resulting metric as a diagnostic tool for migration lag.

Recently, Mauri et al. (2017) published a report in which they described the distribution ranges of 242 tree species in Europe. We reanalysed these data to address to what extent the traits of woody plant species predict the occurrence asymmetry of trees of “large” size (i.e. diameter at breast height (DBH) exceeding 12 cm) and “small” size (i.e. DBH below 12 cm) across their distribution zone. Addressing whether tree migration occurs with this dataset is desirable, given that the majority of comparable studies are based on a single dataset from the United States Department of Agriculture (e.g. Fei et al. 2017; Iverson et al. 2008; Zhu et al. 2012). Obviously, there are numerous factors that can influence the size of trees, such as temperature and canopy state. For this reason, we addressed migration lag with two complementary approaches: (a) a direct approach where we only analysed tree records in the two latitudinal extremes of their distribution range on the assumption that these trees experience comparable fitness (defined here in terms of phenotypes) and (b) a model-based approach where we could correct for the influence of temperature on plant growth and for limited distribution information for some species. Combining these two approaches allows us to address the issue in a more robust fashion.

Because migration in trees only occurs via reproduction and death, we expect that migration of plants in response to climate change will be related to dispersal ability and longevity (Nathan et al. 2011). We focused on four specific traits: longevity, tree height, seed mass, and specific leaf area. Increasing longevity is expected to delay migration since trees on the trailing southern end take longer to die. Note also that longevity is a proxy for generation time and that a short generation time may allow rapid evolution (Franks et al. 2007), though we find it unlikely that woody species have enough time to adapt to the changing conditions (Lenoir et al. 2008). Dispersal is aided by greater tree height and hindered by increasing seed mass (Smith and Fretwell 1974; Turnbull et al. 2008). Finally, specific leaf area (SLA), which is representative of the worldwide leaf economics spectrum (Wright et al. 2004), differentiates plants in terms of survival strategies. We expect an edge expanding northwards to be dominated by young small trees, while an edge contracting northwards will be dominated by old trees. Thus, for each species, we should observe a lower proportion of large trees at the northern extreme and a higher proportion at the southern edge. Our hypothesis was that tree species that have a high dispersal ability and a short generation time experience a lower migration lag than other species.

2 Materials and methods

2.1 Sources of data on tree distribution

For our analysis, we used tree distribution data from EU-Forest (Mauri et al. 2017). The dataset is a compilation of three datasets reporting location and size class of tree individuals in Europe and also contains information about the distribution range of each species. The dataset classifies trees with regard to their diameter into two size classes: those that exceeded 12 cm and those that did not. The core of the dataset originates from the National Forest Inventory (NFI) and contains records aggregated to a 1 × 1-km resolution. In total, the dataset contains 588,983 occurrence records for woody species spanning over 30 countries, out of which 558,983 contain diameter information. The publishers of the dataset (Mauri et al. 2017) calculated the occurrence envelopes for individual species and deemed that 2749 observations described species occurring outside their occurrence range. For each species, we counted the number of trees in both classes, after filtering out entries that lacked diameter information. Our expectation was that a low proportion of large trees was suggestive of many young individuals, whereas a high proportion implied mostly older individuals. We understand that there exists a wide range of factors other than age that may influence the diameter of a tree (Coomes and Allen 2006), so that large trees are not always older than small trees. Even under optimal growth conditions, there may be differences in the pace at which trees grow at their northern and southern distribution limits. Nevertheless, we share the expectation that, whatever the environment, trees grow larger with age, so that diameter is usually correlated with age.

2.2 Sources of data on plant traits and invasive status

We collated trait data for our traits of interest from several databases, namely the Ecological Flora of the British Isles, LEDA, and D3 (Fitter and Peat 1994; Hintze et al. 2013; Kleyer et al. 2008). Since some of these trait databases, such as LEDA (Kleyer et al. 2008), collate data from many different sources, we often encountered extreme trait values. These reflect either extreme habitats or possibly different standards of measuring the trait value. Thus, to achieve better robustness, we used median rather than average trait values. We implemented a two-step procedure. First, because databases such as LEDA contained more than one trait observation per species, we extracted the median trait value per database. Second, we calculated the median of these values across the three databases.
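The two-step median can be illustrated with a minimal sketch. The function name `two_step_median`, the database labels, and the seed-mass values below are hypothetical and serve only to show how a single extreme record within one database is dampened.

```python
from statistics import median

def two_step_median(records):
    """records: (database, value) pairs for one species and one trait."""
    per_db = {}
    for database, value in records:
        per_db.setdefault(database, []).append(value)
    # Step 1: median within each database.
    db_medians = [median(values) for values in per_db.values()]
    # Step 2: median across the per-database medians.
    return median(db_medians)

# Hypothetical seed-mass records (mg) for a single species; 40.0 is an outlier.
example = [
    ("LEDA", 3.1), ("LEDA", 3.4), ("LEDA", 40.0),
    ("EcoFlora", 3.0),
    ("D3", 3.6),
]
seed_mass = two_step_median(example)
```

With these made-up values, the outlier in the first database does not shift the final trait value, which is the robustness property the procedure aims for.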

Not all the trees occurring in Europe are native, and we wanted at a later stage to assess how our observations differed between native and invasive (non-native but naturalised) trees in Europe. To classify our tree species as native or invasive to Europe, we used the up-to-date database presented by Rejmánek and Richardson (2013). When we encountered species that were absent from that database, we searched for their origin in the description of the species in Wikipedia.

2.3 Approach one—direct approach

We worked with the two latitudinal extremes of the tree distributions. We assumed that trees in these two extremes experience relatively low fitness. Because DBH is indicative of both tree age (only adult canopy trees can have a diameter exceeding 12 cm) and fitness, we expected a more or less equal proportion of large trees in these two extreme zones in the absence of migration.

To increase confidence in taxa distribution ranges, we worked with the subset of 105 tree species (Table S1) for which the northern and southern ends of their distribution fell within the 0.05 and 0.95 quantiles (i.e. buffer zones) of the data entries in the entire dataset. For our analysis, we compared size distribution data on trees that occurred in either the northern 5% or the southern 5% of the latitudinal range.

To correct for factors that varied across tree species, such as their size, we used the natural log response ratio of the proportion of large trees (relatively to the total number of stems with diameter data) in the northern strip over the southern strip of their distribution range. The log response ratio is one of the commonest metrics in meta-analyses, because it tends to correct for idiosyncratic differences between groups of observations (Hedges et al. 1999; Lajeunesse 2011). In our case, we refer to this ratio as migration lag:

$$ {D}_{\mathrm{L}}=\log \left(\frac{p_{\mathrm{N}}}{p_{\mathrm{S}}}\right) \tag{1} $$

where pN is the proportion of large trees in the northern strip and pS the proportion of large trees in the southern strip. A high negative value suggests that in the northern strip, there are many more young individuals than in the southern strip, which suggests immigration. By contrast, a value close to zero suggests that the proportion of large trees is equal between the two strips.
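As a worked illustration of Eq. 1, the following sketch computes the log response ratio from a list of records. It assumes, for simplicity, that the latitudinal range is taken from the records themselves rather than from published distribution ranges; the function names and the records are hypothetical.

```python
import math

def strip_stats(records, lo, hi):
    """Return (proportion of large trees, record count) for lo <= latitude <= hi."""
    flags = [is_large for lat, is_large in records if lo <= lat <= hi]
    if not flags:
        return None, 0
    return sum(flags) / len(flags), len(flags)

def migration_lag(records, strip=0.05):
    """records: (latitude, is_large) pairs; is_large is True when DBH > 12 cm.

    Assumption: the latitudinal range is derived from the records themselves.
    """
    lats = [lat for lat, _ in records]
    south, north = min(lats), max(lats)
    width = strip * (north - south)
    p_n, n_n = strip_stats(records, north - width, north)
    p_s, n_s = strip_stats(records, south, south + width)
    if not p_n or not p_s:
        return None  # the log ratio is undefined when a strip has no large trees
    return math.log(p_n / p_s), n_n, n_s

# Hypothetical records for one species spanning 40 to 60 degrees north.
records = [
    (40.0, True), (40.3, True), (40.6, True), (40.9, False),    # southern strip
    (45.0, True), (50.0, False), (55.0, True),                  # range interior
    (59.2, False), (59.5, False), (59.8, True), (60.0, False),  # northern strip
]
d_l, n_north, n_south = migration_lag(records)
```

Here one in four northern records and three in four southern records are large, so the ratio is 1/3 and the resulting value is negative, the pattern that the text interprets as a sign of immigration at the northern edge.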

In our linear models, we weighted each species by the product of its numbers of observations in the northern and southern strips, divided by two. This weighting scheme, which assumes a quadratic increase with the number of observations, reflected the higher confidence that we had in estimates of migration lag originating from many observations that were balanced between the two latitudinal extremes. Our aim was to assess the degree to which (the intensity of) migration lag could be predicted by traits of individual trees. We further repeated our analyses with the subset of trees that were classified as native.
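The weighting scheme can be sketched as a weighted least-squares fit of migration lag on a single trait. This is a dependency-free illustration only: the function name `weighted_linear_fit` and all species values are hypothetical, and the sketch does not reproduce the significance tests reported in the Results.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = a + b * x; returns (intercept, slope, r_squared)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    syy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

# Hypothetical values for four species.
log_seed_mass = [0.5, 1.2, 2.0, 3.1]
lag = [0.2, -0.1, -0.6, -1.0]
n_north = [12, 30, 8, 20]   # records in the northern strip
n_south = [10, 25, 9, 18]   # records in the southern strip

# Weight = product of the two counts divided by two, as described in the text.
weights = [nn * ns / 2 for nn, ns in zip(n_north, n_south)]
intercept, slope, r_squared = weighted_linear_fit(log_seed_mass, lag, weights)
```

Because the weight is a product of the two counts, a species with many records in only one strip receives less weight than a species with the same total split evenly between strips.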

2.4 Sensitivity analyses—direct approach

In the three trait databases we stated earlier, we could not find data for all tree species for all traits. In contrast to the other three traits, seed data can also be obtained via seed retailers. We thus sought seed mass information for some of the tree species we had excluded from further analysis. These are actually common seeds and six of the nine were available from one seed retailer. For the other species, we used median seed mass for the genus to which they belonged in LEDA.

To address the possibility that some influential observations were driving our results, we subjected the dataset to a jack-knife sensitivity analysis: We sequentially removed one of the tree species and assessed the P value of the correlation between seed size and migration lag.
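A leave-one-out loop of this kind can be sketched as follows. To stay dependency-free, the sketch tracks the unweighted regression slope rather than the P value used in the analysis; the species labels and values are hypothetical.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def jackknife_slopes(names, x, y):
    """Refit the regression once per species, each time leaving that species out."""
    result = {}
    for i, name in enumerate(names):
        result[name] = ols_slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    return result

# Hypothetical species labels, log seed masses, and migration lag values.
species = ["sp_A", "sp_B", "sp_C", "sp_D", "sp_E"]
log_seed_mass = [0.5, 1.2, 2.0, 3.1, 4.0]
lag = [0.2, -0.1, -0.6, -1.0, -1.4]

loo = jackknife_slopes(species, log_seed_mass, lag)
# A species is influential if dropping it changes the sign of the slope.
influential = [name for name, s in loo.items() if s >= 0]
```

In this made-up example no single species reverses the relationship, so the list of influential species is empty.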

To further address sensitivity issues, we reproduced our analysis after varying the extent of the extreme latitudinal strips and buffer zones: We used latitudinal strips ranging from 5% (default value) to 20% of the distribution range of the species. We did not experiment with smaller than 5% strips because these induced a large decline in our statistical power via the exclusion of woody species. We used three different buffer zones: 2.5% of the data entries, 5% of the data entries (default), and 10% of the data entries.

We additionally corrected for phylogenetic dependencies in our data via phylogenetic independent contrasts (Felsenstein 1985). We used Phylocom v 4.1 to prune the Phylomatic tree R20120829, so that it matched the species in our dataset. We then used the function BladJ to assign branch lengths according to published angiosperm node ages (Wikstrom et al. 2001). We present a Newick version of that tree in Table S2. The phylogenetic independent contrast approach assumes a Brownian evolution model. We subsequently assessed the strength of the correlation between the phylogenetically corrected values of migration lag and seed size with a Kendall tau correlation test.

2.5 Approach two—modelling approach

Since there are numerous factors that can influence the size of the trees, such as temperature and canopy state, it is helpful to address migration lag with an alternative approach. In the model-based approach, we could correct for the influence of temperature on plant growth and for limited distribution information for some species. Since observations were not all fully symmetric with regard to the distribution range of the species and, in some cases, the tree distribution zone was not resolved sufficiently (for example when the distribution zone extended outside Europe), we were aware that we would encounter smaller effect sizes with this approach. We extracted annual mean temperature information from WorldClim (Hijmans et al. 2005) at a resolution of 10′ latitude and longitude. We first fitted a binomial model to the individual records of each tree species with temperature as the sole continuous predictor and tree size as the binary response variable (i.e. 0: smaller than 12 cm, 1: larger than 12 cm). We then extracted the standardised residuals of this model. We used the standardised residuals because tree species should differ in the degree to which they are responsive to temperature. By using standardised residuals, we allowed for cross-species comparisons of residuals. We then fitted a general linear model with normally distributed errors that used the standardised residuals as the response variable and latitude as a continuous predictor. From this model, we extracted the slope, which was suggestive of the way the tree species responded to latitude after we had corrected for temperature differences across its distribution zone. At this stage, we aimed at removing cases where all entries with diameter information for a tree species occurred over a narrow strip of its latitudinal distribution. We expected that, in the cases where we had sufficient information, the slope would be non-zero by chance alone. This is why we introduced two inclusion criteria specific to each tree species:

  (i) there should be a minimum of five diameter observations per tree species;
  (ii) the slope was non-zero.

The second inclusion criterion was true for 54 out of the 150 tree species for which we had more than five observations of diameter. We included these 54 slopes into our linear model phase. We finally regressed the slopes (dependent variable) against trait information (either of the four traits). We weighted the regression with number of observations per tree species (Fig. S6).
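The per-species steps of the modelling approach can be sketched as below. This is a minimal, dependency-free illustration under stated assumptions: the binomial model is fitted by Newton-Raphson, Pearson residuals stand in for the standardised residuals (no leverage adjustment is applied), and all function names and records are hypothetical.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Newton-Raphson fit of P(y = 1) = 1 / (1 + exp(-(a + b * x)))."""
    a = b = 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Gradient of the log-likelihood with respect to a and b.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, x))
        # Entries of the 2 x 2 information matrix.
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def latitude_slope(temps, lats, large):
    """Slope of temperature-corrected size residuals against latitude."""
    a, b = fit_logistic(temps, large)
    resid = []
    for t, yi in zip(temps, large):
        p = 1.0 / (1.0 + math.exp(-(a + b * t)))
        # Pearson residual (assumption: used here in place of standardised residuals).
        resid.append((yi - p) / math.sqrt(p * (1.0 - p)))
    n = len(lats)
    ml, mr = sum(lats) / n, sum(resid) / n
    sxy = sum((la - ml) * (r - mr) for la, r in zip(lats, resid))
    sxx = sum((la - ml) ** 2 for la in lats)
    return sxy / sxx

# Hypothetical records for one species: mean annual temperature (deg C),
# latitude (degrees north), and size class (1 when DBH > 12 cm).
temps = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0]
lats = [62.0, 60.0, 58.0, 55.0, 52.0, 50.0, 47.0, 45.0]
large = [0, 1, 0, 1, 0, 1, 1, 1]

slope = latitude_slope(temps, lats, large)
# Inclusion criteria: at least five diameter records and a non-zero slope.
include = len(large) >= 5 and slope != 0.0
```

The slopes retained after this filter would then enter a regression against each trait, weighted by the number of observations per species.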

Table 1 Cumulative characteristics for the 34 tree species for which we could assess migration lag with our direct approach based on our default settings. To determine invasive status in Europe we used data from Rejmánek and Richardson (2013). We marked entries where we had to further search for invasive state in other resources with a star and present information in Table S5

3 Results

3.1 Direct approach—relationship with traits

We had sufficient data to assess migration lag for 34 tree species (Table 1); values varied between − 1.5 and 0.85. Initial analyses showed that migration lag related poorly to longevity, tree height, and SLA (Fig. 1). By contrast, there was a negative relationship with (log transformed) seed mass (R2 = 0.13; P = 0.04; Fig. 1c). The relationship was stronger after retrieving missing seed information (R2 = 0.19, P = 0.009, Fig. S1) and across invasive trees (R2 = 0.28, P = 0.008; Fig. S2) than across native trees (R2 = 0.003, P = 0.84).

Fig. 1

Relationship (direct approach) between migration lag and a tree longevity, b tree height, c seed weight (we log-transformed seed weight values), and d specific leaf area for the 34 plant species for which we had migration lag information. Only the relationship with seed weight in panel c was significant. We further repeated the analysis with additional seed weight entries (Fig. S1) and conducted a jackknife sensitivity test. Filled circles in grey stand for tree species that are invasive in Europe. Note that statistics are based on a weighted regression and that we did not use any means of depicting the weights of the entries in this figure

3.2 Direct approach—sensitivity analyses

The relationship was robust against a jack-knife sensitivity analysis, the sole exception being the exclusion of Quercus rubra. Varying the extent of the extreme latitudinal strips and buffer zones did not change our results (Table 2, Figs. S3 and S4). We detected phylogenetic signal only with regard to the trait seed mass (we log-transformed the trait values to linearize the relationship). The relationship between seed mass and migration lag was no longer significant after correcting for phylogeny via phylogenetic independent contrasts (Fig. S5).

Table 2 Sensitivity analysis of how the significance of the relationship between migration lag and seed weight changes when we alter the breadth of the northern and southern strips from the default value of 0.05 in our direct approach. We use these two strips to specify which trees have an extreme distribution. A narrow strip could limit the analysis to a few individuals whereas a wide strip may include trees which grow under less adverse growth conditions. The values are proportions of the total (latitudinal) distribution range. We only used strip breadth values exceeding 0.05 in this sensitivity analysis because we had very little statistical power in smaller strip breadth

3.3 Modelling approach

We present a scatterplot of the slope between latitude and temperature-corrected tree size against the number of replicates per plant species in Fig. S6. We manually corrected a tree height record in the Ecological Flora of the British Isles for Picea sitchensis from 400 to 40 m. The results did not change with this correction. There was lower variance in the slopes for well-replicated trees, which justified our use of a weighted modelling approach. We detected a negative relationship between slope values and seed mass (adjusted R2 = 0.09, P = 0.03; Fig. 2). The relationships with the other three traits were not significant.

Fig. 2

Relationship (modelling approach) between migration lag and a tree longevity, b tree height, c seed weight (we log-transformed weight values), and d specific leaf area for the 34 plant species for which we had migration lag information. Only the relationship with seed weight was significant. To address the relationships, we weighted tree species with their number of observations. The relationship was robust to alternative weighting schemes, such as a weighting with the square root of the number of observations

4 Discussion

Our empirical analysis links migration lag with specific plant traits. We have shown that tree species, which produce larger seeds, are subject to a stronger migration lag than small-seeded trees. We believe that this is because plant species that produce larger seeds have to compromise seed output (Turnbull et al. 2008), and this limits their dispersal potential into new environments. Our original expectations were that longevity would better explain migration lag. However, perennial plants do not possess discrete generations and upon maturity they can produce fruit for many successive years. Moreover, our data were collected during a time window shorter than one generation. Besides, terrestrial habitats are rarely saturated with plant individuals (Wilsey and Polley 2003), and new individuals can establish even when there are no deaths. Thus, on the timescale of this study at least, tree longevity poorly captures migration lag dynamics. As we argue in the introduction, a large migration lag may manifest susceptibility to global change. To this end, we believe that our results might facilitate future conservation efforts through highlighting taxa susceptible to global change. Furthermore, we believe that our analysis raises concerns about future sustainability of woody ecosystems. There is convincing empirical evidence that seed mass is a plant trait that mediates a partitioning of the plant niche (Ben-Hur et al. 2012). In case we selectively lose large-seeded tree taxa, we should expect alpha diversity in woodlands to decline which is likely detrimental for ecosystem functioning.

Many forest ecologists share the concern that the literature underestimates the importance and the pace of the poleward migration of woody species (Bertrand et al. 2016; Jump et al. 2009). An apparent reason for this is that methodological difficulties, such as limited availability of high-quality data, have slowed the pace of studying the poleward migration of woody species (Jump et al. 2009). Another cause is that plants, and particularly woody plant species, require time to equilibrate their distribution range with temperature conditions. The resulting mismatch between the observed distribution range and the range expected at equilibrium with temperature conditions, which we know as climatic debt, results in our observations underestimating the velocity of tree migration (Bertrand et al. 2011; Bertrand et al. 2016). Moreover, distribution ranges of trees tend to change at speeds that exceed our expectations based on inferred dispersal ability; this mismatch is known as Reid’s Paradox after Clement Reid who first reported it (Clark et al. 1998). Clark et al. (1998) suggested that this was because the distribution of tree seeds in space deviates from a Gaussian distribution by being “fat-tailed,” which implies that we tend to underestimate dispersal potential when using parametric tools. To this end, even the most convincing studies on migration potential, such as Bertrand et al. (2016) and Zhu et al. (2012), can only generate comparative estimates for the species and habitats they study. Finally, in cases where we focus estimates of migration lag on subsets of common species with large distribution ranges, we could be underestimating the pace of migration and the risk of extinction if species with a limited geographic occurrence are more likely to experience more dramatic changes in their distribution range (Schwartz et al. 2006).
In our analyses, we could only address effectively the final point we make: in our direct approach, we were more likely to include woody species with a small distribution range because their distribution range was less likely to overlap with the two extreme 5% quantiles of our entries, which represented one of our exclusion criteria. Despite the fact that we may be underestimating the pace of migration, we detected a substantial relationship between seed mass and migration lag. We thus believe that our results are equally straightforward to interpret.

Our analysis was subject to a number of assumptions. First of all, in the case of the direct approach, it might have been the case that, even in the absence of a migration lag, trees grew on average slightly bigger in the southern extreme of their distribution zone. The reason might be that they experienced better climatic conditions there, but diseases due to factors such as drought and nutrient deficiencies (Veresoglou et al. 2014) prevented them from expanding any further southwards. In such a case, we could expect that all tree species faced a comparable disadvantage in the northern extreme of their distribution, resulting in consistently negative migration lag values. Through the regression in our direct approach, however, we compared relative estimates of DL across tree species, meaning that in the absence of migration lag, there should have been no relationship between DL and trait values. Another concern arises from the lack of information on the management history of the regions where the data originated. Management history is a factor that can influence tree size considerably. Nevertheless, we carried out our analysis with the entire sets of tree species that met our inclusion criteria, which included both rare (direct approach) and abundant (modelling approach) tree species. Moreover, the distribution ranges of the trees we considered varied widely and offered a good coverage of Europe. We believe that, as a result, there were strong averaging effects for any particular practices that favoured larger or smaller trees across the entire continent. The only likely exception might have been management practices that are confounded by latitude, such as the intensive forest management practices applied in Scandinavian forests (Östlund et al. 1997).
In that case, however, we would expect that there would be a relationship when we used our direct approach between migration lag and the northern extreme of the distribution of the tree species which in our analysis was absent (tau = − 0.016, P = 0.87). We are thus convinced that our analysis depicts an actual relationship between seed mass of tree species and migration lag.

Our results were further supported by the modelling approach. Unlike in the direct approach, we did not limit this analysis to the latitudinal extremes of the species distribution, and that way we could work with a considerably larger pool of woody species. Having corrected for temperature sensitivity, we could narrow down the focus of the analysis to the proportion of variance that is explained exclusively by latitude. That way we could work with all records for tree species and not only the latitudinal extremes as we were forced to do in the direct approach. At the same time, though, the fit of the correction for temperature was crude and depended on the number of our observations. As a result, the relationships were weaker and, in the cases where the observations originated from narrow latitudinal strips, they were stochastic (Fig. 2). We present here both analyses and view them as complementary to each other.

We feel that there are several avenues to further build on the results presented here. An obvious next step might involve challenging our results with data from fossil records. We also think that case studies on how susceptible to extinction individual large-seeded tree species are would be desirable. Combining empirical and modelling data could greatly augment our ability to understand how distribution ranges of trees change in response to climate change.