Introduction

The commercial demand for woody biomass is expected to grow in the future. Fast-growing Populus species (poplars and aspens) grown in short- or medium-rotation forestry provide an alternative to biomass from conventional forests, as they contribute to efficient and sustainable land use and provide a flexible final product for use in material or energy production [1]. Nevertheless, to ensure sustained and stable productivity, we need to identify or breed for clones adapted to the conditions under which they are grown [e.g., 2, 3].

Phenology plays a pivotal role in survival and productivity of deciduous species, including Populus spp. While all aspects of plant activity are affected by growing conditions, leaf phenology defines the period during which carbon fixation and growth can occur. Leaf phenology is linked to net carbon uptake [4] and tree height increase [e.g., 5]: it is therefore highly correlated with biomass production. The timing of leaf phenology transitions [including bud break, bud set, leaf senescence and shedding; 6] needs to balance the risks of plant organs being exposed to harsh winter conditions with the opportunity to exploit optimal conditions for carbon fixation [and hence growth and reproduction; 7, 8–12]. While conservative strategies (late bud break and early autumn phenology) can limit the exposure to damaging low temperatures [13, 14], such a risk-averse strategy reduces the length of the effective growing season and can curtail plant growth, competitive ability, and final biomass production. Knowledge of the response to photo-climatic conditions is thus key to determining the suitability of plant material for a specific location.

Climate change, along with elevated air CO2 concentration, can alter growing conditions, making currently optimal strategies a poor match to future conditions. Warming can advance spring phenology, but fluctuations in temperature will still be possible, including low temperatures that can damage the new tissues [e.g., 15, 16]. Warming is projected to be accompanied by more frequent unseasonal or extreme weather events. Indeed, the risk of spring damage by low temperatures has already paradoxically increased along with temperatures over the last century [17]. While temperature-related cues are expected to shift under climate change, other environmental cues such as photoperiod and additional light-dependent factors will remain stable, potentially constraining the ability of trees originating at lower latitudes to thrive at higher ones [18]. Photoperiod- and light-dependent factors are recognized as important signals for autumn phenology initiation [19], but they also moderate spring leaf out [10]. In general, the interplay between temperature and photoperiod cues can lead either to a reduction of the effects of climate change, because of photoperiod control on phenology [10, 14, 20], or to a mismatch between the evolutionarily determined responses based on temperatures and local day length, which might reduce the productivity of currently adapted tree growth strategies. In Populus, clones of different provenances often exhibit large phenotypic variability and local adaptation [21, 22], and photoperiod-related cues to spring phenology are expected to be less important than in many other species, as is common in early successional species [20, 23]. As a result, Populus spp. are exposed to potentially risky changes in phenology, such as advancements in bud break. Projected climate change thus requires the evaluation of existing Populus clones to determine whether they are able to adapt to and benefit from the new conditions, while reducing their negative effects.

The need to evaluate existing clones and develop new ones is particularly acute at higher latitudes. Intensively managed Populus plantations grown in short rotation are increasingly important in Northern Europe, as a sustainable source of locally produced renewable biomass [24, 25]. But most Populus plant material was developed for Central and Southern Europe. With respect to these regions, higher latitudes are characterized by generally cooler temperatures, shorter summer nights, and faster changes in day length around the equinoxes. Plants not adapted to local conditions, such as clones of other origins, might not operate under the most productive strategy for leaf phenology [26].

The constraints imposed by low temperatures typical of Northern Europe are being eased by climate change faster than elsewhere [27], thus potentially making clones developed for warmer climates more suitable to the new local conditions. Some assisted migration experiments showed that northward transfers of clones have potential benefits for growth [5, 28], but it remains unclear whether existing clones are able to thrive in Northern environments, or whether new clones will be needed.

Beyond direct observations of leaf phenological stages in clones grown in specific locations, models can assess and compare phenological responses to growing conditions, and even predict the response of specific clones to altered conditions. A commonly used model for spring phenology is based on the concept of growing degree days (GDD), whereby specific phenological stages occur when thresholds of temperature sums are reached. The GDD model for spring phenology performs as well as or better than other, more detailed, empirical and process-based models [29], including for P. tremula L. [30].

Here, we examine how different Populus clones react to climatic conditions—chiefly temperature and photoperiod. We investigate spring and autumn leaf phenology (and hence effective growing season length) and define the extent to which tree responses are predictable based on temperature and photoperiod over a latitudinal and climatic gradient. Specifically, we aim at answering the following questions:

  1. Can the timing of leaf phenology of specific clones be predicted by growing conditions, in particular temperature and photoperiod?

  2. If not, are the relative differences among clones conserved across years and sites? And which variables characterizing phenology are the most suitable to robustly rank the clones?

We answer these questions with a set of leaf phenological observations collected in common garden experiments in six sites in the Baltic region, in Northern Europe (Fig. 1; Table 1). The parameters of a Weibull-type curve describing the observed phenology and the threshold temperature sum of the GDD model are used to compare and predict the response of clones to photoperiod and climatic conditions.

Fig. 1

a Site location and b long-term (1950–2019) weekly average temperatures of the six sites in the Baltic region, in Northern Europe. The shaded areas in (b) correspond to the periods, expressed as days of the year (doy), on which Supplementary Information (SI) Fig. S1 and S2 focus

Table 1 Summary of site features and data availability for the six Populus common gardens, located in Sweden (SE), Latvia (LV), and Lithuania (LT), along a latitudinal and climatic gradient. More information on the climatic conditions and plant material are reported in the SI (Section S1 and S2 respectively)

Methods

Data

Site Information

Phenological data on bud break and shedding were collected in six sites in Northern Europe, in Sweden, Latvia, and Lithuania (Table 1), during 2017 and 2018. The sites differ in latitude and climate (Fig. 1).

Furthermore, at such high latitudes, the differences in photoperiod are large, despite the sites spanning just 4.4° of latitude (Table 1). Northernmost sites have longer days in the spring and shorter in the autumn than more southern sites. Differences in photoperiod become larger later in the spring and autumn, i.e., further away from the equinoxes (Supplementary Information, SI, Fig. S1).
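The dependence of photoperiod on latitude and day of the year can be illustrated with a short calculation. The sketch below is not part of the original analysis: it assumes a common cosine approximation for solar declination, and the two latitudes (55.0° and 59.4° N, i.e., 4.4° apart) are illustrative values rather than the actual site coordinates.

```python
import math


def day_length_hours(doy: int, latitude_deg: float) -> float:
    """Approximate day length (hours) for a day of year and a latitude."""
    # Solar declination in degrees (simple cosine approximation, an assumption).
    declination = -23.44 * math.cos(2.0 * math.pi * (doy + 10) / 365.0)
    x = -math.tan(math.radians(latitude_deg)) * math.tan(math.radians(declination))
    # Clamp so that polar day and polar night do not break acos.
    x = max(-1.0, min(1.0, x))
    return 24.0 / math.pi * math.acos(x)


def day_length_gap(doy: int, lat_north: float = 59.4, lat_south: float = 55.0) -> float:
    """Day length at the northern latitude minus that at the southern one (hours)."""
    return day_length_hours(doy, lat_north) - day_length_hours(doy, lat_south)


if __name__ == "__main__":
    for doy in (81, 110, 150, 280, 300):
        print(doy, round(day_length_gap(doy), 2))
```

Under these assumptions, the gap is close to zero near the spring equinox, grows positive as spring advances, and becomes increasingly negative as autumn advances, in line with the pattern described above.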

For each site, daily average temperatures and daily precipitation totals were extracted from the E-OBS gridded dataset with a resolution of 0.25° × 0.25°, starting in 1950 [31]. The accuracy of the dataset depends on the number and spatial distribution of the underlying meteorological stations used for the interpolations. Data for the Swedish sites are based on a dense meteorological station network; the coarsest station distribution occurred in Lithuania. Yet, temperatures, and in particular average daily temperatures, are generally relatively homogeneous in space even over tens of kilometers in the absence of elevation changes, which gives confidence in the temperature data used for the analyses.

As expected, long-term mean temperatures are not described by latitude alone. Climate is more continental in the Latvian and Lithuanian sites (Ludza; and Šašaičiai and Anykščiai, respectively), with colder winters and warmer summers than in the Swedish sites (Krusenberg, Remningstorp and Våxtorp; Fig. 1b). Spring temperatures are on average higher in Ludza and Anykščiai than in Šašaičiai and the Swedish sites. Conversely, autumns are coldest in Ludza and warmest in Våxtorp, with the other sites exhibiting similar long-term mean temperatures, despite the differences in latitude (Fig. 1a) and hence photoperiod (SI, Fig. S1). The weather conditions also differed between the two sampling years. In all sites, the year 2017 was similar to the long-term mean, while 2018 was abnormally warm and dry, particularly during the spring and summer (SI, Fig. S2 and S3). The conditions for 2018 are expected to be the new norm in the region by mid-century [32].

Phenological Data

Data on leaf phenology were collected in spring and autumn for 1 or 2 years, depending on site and season (and are available upon request). The plant material sampled included about 160 Populus clones in total and is broad regarding origin and phenological characteristics. The majority is P. trichocarpa Torr. & Gray, with germplasm from almost the entire geographical range of this species. Nevertheless, sites differed in clones represented, tree age, and experimental design (see SI Section S2 for more details). Five clones, chosen among those with high performance in Krusenberg, were common to all sites; five additional clones occurred in three sites of similar latitude (Våxtorp, Šašaičiai, and Anykščiai; see SI Section S2 for information on these 10 clones).

Spring phenology was assessed by scoring bud break on a scale from 1 to 5, with stage 2 corresponding to initial shoot emergence, stage 3 to leaf primordia exposed, stage 4 to leaves half open with bud scale dropped, and stage 5 to leaves completely open. In the autumn, leaf coloring was scored on an 8-point scale, from completely green to completely yellow, whereas leaf shedding was scored on a 1 to 3 scale, where 1 corresponds to full foliage, 2 to half, and 3 to full defoliation. There was some overlap between color change and leaf shedding, as leaves can change color and even fall while there are still some green leaves present. The frequency of spring and autumn scoring of phenology varied widely from site to site and from year to year, resulting in 0 to 10 scores per year and season (Table 1), with scoring every 3 to 7 days in most cases, but with some intervals up to 15 days.

Leaf Level Properties for Selected Clones in One Site

Beyond phenological and climatic data, in one site (Krusenberg), we also measured leaf gas exchange and characterized leaf chlorophyll content and specific leaf area for four clones with similar origin but differing in their spring phenology. Details on the measurement protocol are reported in the SI, Section S4. These clones were selected to represent the “fastest” and “slowest” among those in the site; three of them are common to all sites (SI, Section S2). The goal of these measurements was to check whether co-variation of some key plant traits with leaf phenology patterns is to be expected and could, in principle, explain differences in plant performance.

Characterization of Spring and Autumn Phenology

We characterized the observed spring and autumn phenology by means of different variables, and specifically the timing and speed of changes in the phenological scores in spring and autumn, and the thermal conditions needed to reach a certain phenological score in the spring. These variables are obtained by fitting a saturating curve to the observed phenological scores (“Fitting of Leaf Phenology Scores” section) and by means of the growing degree day (GDD) model (“Spring Phenology Model” section) respectively. The fitted phenological scores are also used to determine the length of the effective growing season (“Definition of the Effective Growing Season” section). All the fitted parameters are available as Supplementary Data.

Fitting of Leaf Phenology Scores

In some cases, in particular in the sites where screening frequency was low or during periods of fast development, information on the exact day on which specific phenological stages were reached is missing. To allow the timing of each stage to be estimated, a saturating Weibull-type curve was fitted to the observed scores:

$$ s_i(t) = 1 + \frac{s_{\max,i} - 1}{1 + \exp\left[-k_i\left(t - t_{50,i}\right)\right]} \tag{1} $$

Here, smax, i is the maximum score in the scale; t50, i is the day of the year (doy) at which the intermediate score is reached; ki sets the slope of the curve at t = t50, i; and the subscript i refers to the phenological event under consideration (i = bb for bud break; i = ls for leaf shedding; i = col for leaf coloring). The advantage of the time dependence in Eq. (1) is that it requires the fitting of just two parameters, t50, i and ki. In addition, these parameters have a clear meaning: respectively, the timing of the intermediate phenological score, and the speed of change around that time and score. Importantly, the curve does not allow a regression in time of phenological stage, which would be unrealistic. Here, the two parameters are estimated by least-squares fitting across all individuals of the same clone in each site and year. By considering all observations for a clone, i.e., over multiple individuals, more robust estimates are obtained, at the cost of losing any measure of within-clone variability. Examples of spring (top) and autumn leaf shedding (bottom) scoring and the fitted curve are reported in Fig. 2 for two clones markedly differing in phenological response to growing conditions.
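As an illustration of how the two parameters of Eq. (1) can be estimated, the following minimal sketch pools (doy, score) pairs and minimizes the sum of squared errors with a repeatedly refined grid search. The grid search is a dependency-free stand-in chosen for this example, not the least-squares routine used in the study; the function names, search bounds, and number of refinement rounds are assumptions.

```python
import math


def score_curve(t, t50, k, s_max):
    """Eq. (1): fitted phenological score at day of year t."""
    return 1.0 + (s_max - 1.0) / (1.0 + math.exp(-k * (t - t50)))


def sum_sq_error(obs, t50, k, s_max):
    """Sum of squared differences between observed and fitted scores."""
    return sum((s - score_curve(t, t50, k, s_max)) ** 2 for t, s in obs)


def fit_curve(obs, s_max, rounds=4, n=40):
    """Estimate (t50, k) by least squares using a refined grid search.

    obs: list of (doy, score) pairs pooled over all trees of one clone.
    The initial bounds for t50 and k are illustrative assumptions.
    """
    days = [t for t, _ in obs]
    t_lo, t_hi = min(days) - 10.0, max(days) + 10.0
    k_lo, k_hi = 0.01, 2.0
    best = None
    for _ in range(rounds):
        t_step = (t_hi - t_lo) / n
        k_step = (k_hi - k_lo) / n
        best = min(
            (sum_sq_error(obs, t_lo + i * t_step, k_lo + j * k_step, s_max),
             t_lo + i * t_step, k_lo + j * k_step)
            for i in range(n + 1) for j in range(n + 1)
        )
        _, t_best, k_best = best
        # Zoom in around the current best grid point.
        t_lo, t_hi = t_best - t_step, t_best + t_step
        k_lo, k_hi = max(1e-6, k_best - k_step), k_best + k_step
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic bud break scores (scale 1 to 5) generated from known parameters.
    data = [(float(d), score_curve(d, 120.0, 0.3, 5.0)) for d in range(100, 141, 5)]
    print(fit_curve(data, s_max=5.0))
```

With real observations, a gradient-based nonlinear least-squares solver would normally be preferable; the grid search is used here only to keep the example self-contained.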

Fig. 2

Example of phenological data (red symbols and whiskers) and fitted curves (black lines) for a, b) bud burst scores and c, d) autumn leaf shedding scores for two clones, and their evolution in time (day of the year, doy), for Krusenberg in 2017. a, c) refer to a clone with generally late spring phenology (44.13) and b, d) to a clone with generally early spring phenology (722.16). Symbols correspond to averages across all individual trees of the clone present in the site; whiskers extend over the average ± the standard deviation across the individual trees

Due to the difficulties inherent in scoring, in particular of larger trees, in some circumstances, scores decreased at subsequent observations. These unrealistic observations were removed, under the assumption that the subsequent (lower) score was not correct, and the corresponding observation was considered as missing for the purposes of the fitting. Also, we restricted the fitting to the cases in which realistic observations were available for at least three dates, including at least six data points from a minimum of two individuals of the clone. We further discarded all the fitted curves with a coefficient of determination lower than 0.2. These criteria caused the number of fitted curves to change from year to year in some sites (as apparent from the values reported under the box plots in Figs. 3 and 4 below). Yet, the results reported below do not appreciably change should other approaches to data cleaning be employed.
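The screening rules above can be sketched as follows. This is an illustrative reading of the criteria, not the original implementation: the data layout (a dictionary from a hypothetical tree identifier to its list of (doy, score) pairs) and all function names are assumptions, and a decreasing score is interpreted as one falling below the highest score retained so far for that tree.

```python
def drop_decreasing(series):
    """Discard observations whose score falls below the highest retained score.

    series: list of (doy, score) pairs for one tree, sorted by doy.
    """
    kept, highest = [], float("-inf")
    for doy, score in series:
        if score >= highest:
            kept.append((doy, score))
            highest = score
    return kept


def enough_data(trees):
    """Check the minimum data requirements for fitting one clone.

    trees: dict mapping a tree id to its cleaned (doy, score) list.
    Requires at least 3 distinct dates, 6 data points, and 2 individuals.
    """
    non_empty = [obs for obs in trees.values() if obs]
    dates = {doy for obs in non_empty for doy, _ in obs}
    points = sum(len(obs) for obs in non_empty)
    return len(dates) >= 3 and points >= 6 and len(non_empty) >= 2


def r_squared(observed, predicted):
    """Coefficient of determination; fits below 0.2 would be discarded."""
    mean = sum(observed) / len(observed)
    ss_tot = sum((o - mean) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0


if __name__ == "__main__":
    tree_a = drop_decreasing([(100, 1), (105, 3), (110, 2), (115, 4)])
    tree_b = drop_decreasing([(100, 1), (105, 2), (115, 3)])
    print(tree_a, enough_data({"a": tree_a, "b": tree_b}))
```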

Fig. 3

Summary of fitted spring and autumn phenology parameters and resulting duration of the effective growing season (for the six sites during 2017 or 2018 or both, depending on data availability). For spring phenology, a) t50, bb, i.e., doy at which score 3 was reached; b) kbb, i.e., rate of change of score at score 3. For autumn phenology, c) t50, ls, i.e., doy at which leaf-shedding score 2 was reached; d) kls, i.e., rate of change of score at leaf-shedding score 2. The resulting duration of the effective growing season (from spring score 3 to autumn score 2) is reported in e). From left to right, pairs of bars refer to the six sites, as per the x-axis labels. Colors denote year: 2017 in gray, 2018 in red. Thick horizontal lines correspond to the median, boxes extend from the 25th to the 75th percentile, and whiskers cover the 1st to 99th percentiles. Stars denote significant differences (p < 0.001) in the median values between 2017 and 2018 in each site, based on a paired t test run on the log-transformed variables. Values in parentheses at the bottom of each bar denote the number of data points included in the boxplot, i.e., the number of clones for which adequate data were available and the fitting returned robust results (see the “Fitting of Leaf Phenology Scores” section for details). As such, this value can be lower than the number of clones grown in the site and can vary depending on year and on whether spring or autumn phenology or growing season length is considered

Fig. 4

Summary of growing degree days at which spring phenology score 3 (\( {GDD}_3^{\ast } \)) was reached for the six sites during 2017 or 2018 or both, depending on data availability. Colors and symbols have the same meanings as in Fig. 3

The fitted parameters t50, i and ki (i = bb, ls) were used as indicators of clone-specific response to the environmental conditions, relative to each site and year. Leaf shedding was preferred to leaf coloring as an indicator of autumn leaf phenology, because Populus is often photosynthetically active until very late in autumn and certainly after height growth cessation and the initiation of leaf coloring [e.g., 33]. Thus, height growth cessation in autumn initiates frost-hardening in shoot meristems to safeguard the growth that has already been achieved, but some green leaves are retained until very late in autumn. These leaves are still capable of opportunistic but substantial carbon assimilation and carbohydrate translocation from leaves even after hard frost [33], resulting in biomass increase despite no increase in height. Furthermore, scoring relative to leaf coloring appeared particularly prone to unrealistic observations and low coefficients of determination. Nevertheless, the timings of the intermediate score for coloring and shedding were correlated (SI, Fig. S4).

Definition of the Effective Growing Season

The effective growing season length was defined as the time between the occurrence of the bud break score 3 (i.e., leaf primordia exposed) and the leaf shedding score 2 (i.e., half defoliation). The rationale for choosing the intermediate score for both spring and autumn phenology is to match the focal parameters of Eq. (1). Since in most cases these scores were not directly observed, their time of occurrence was calculated by inverting Eq. (1) after fitting.
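Solving Eq. (1) for time gives the day of the year at which a given score is reached, t = t50, i − ln[(smax, i − 1)/(s − 1) − 1]/ki, which reduces to t = t50, i at the intermediate score. The sketch below applies this inversion to compute the effective growing season length; the function names are assumptions introduced for illustration, and the maximum scores (5 for bud break, 3 for leaf shedding) follow the scoring scales described earlier.

```python
import math


def day_of_score(score, t50, k, s_max):
    """Invert Eq. (1): day of year at which `score` is reached.

    Only defined for 1 < score < s_max, where the curve is strictly increasing.
    """
    if not 1.0 < score < s_max:
        raise ValueError("score must lie strictly between 1 and s_max")
    return t50 - math.log((s_max - 1.0) / (score - 1.0) - 1.0) / k


def effective_growing_season(t50_bb, k_bb, t50_ls, k_ls):
    """Days between bud break score 3 and leaf shedding score 2."""
    start = day_of_score(3.0, t50_bb, k_bb, s_max=5.0)  # bud break scale: 1 to 5
    end = day_of_score(2.0, t50_ls, k_ls, s_max=3.0)    # leaf shedding scale: 1 to 3
    return end - start


if __name__ == "__main__":
    # Hypothetical fitted parameters for one clone, site, and year.
    print(effective_growing_season(t50_bb=120.0, k_bb=0.3, t50_ls=290.0, k_ls=0.2))
```

Because both reference scores are the intermediate ones, the season length here equals the difference between the two fitted t50 values; the general inversion is needed only when other scores are of interest.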

The resulting effective growing season length relative to each clone, site, and year is assumed to be a proxy of the clone fitness under the corresponding growing conditions.

Spring Phenology Model

Spring phenology has been often modeled based on the concept of growing degree days (GDD), determined as

$$ GDD(t) = \sum_{\tau = t_0}^{t} \max\left(0, T(\tau) - T_b\right) \tag{2} $$

where t is time (expressed as doy), t0 is the first doy on which the GDD are accumulated, T(t) is the average daily temperature for doy t, and Tb is the base temperature below which no GDD are accumulated. A specific phenological stage occurs when the accumulated GDD reach a threshold, i.e., GDD(t) = GDD*. Here, the focus was on bud break score equal to 3, i.e., \( {GDD}_3^{\ast } \).

The base temperature, Tb, was set to 5 °C, a standard value also used for Populus [34], but similar conclusions can be reached should other realistic values be considered. Indeed, the choice of Tb can affect the numerical values of GDD(t) and GDD* [35], but, within realistic base temperatures, such changes do not affect the time at which a certain phenological stage is reached, t. The starting date t0 was set arbitrarily to January 1st (i.e., doy = 1), i.e., well before temperatures in the region are normally above the chosen Tb.
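The accumulation in Eq. (2) can be sketched in a few lines. The example below is illustrative only: the function names and the temperature series are hypothetical, the day of the year is treated as an integer, and the defaults mirror the base temperature of 5 °C and the start date of doy 1 stated above.

```python
def gdd_at(doy, daily_mean_temp, t_base=5.0, start_doy=1):
    """Eq. (2): degree days accumulated from start_doy up to and including doy.

    daily_mean_temp[0] holds the mean temperature of doy 1.
    """
    return sum(max(0.0, temp - t_base) for temp in daily_mean_temp[start_doy - 1:doy])


def doy_reaching(threshold, daily_mean_temp, t_base=5.0, start_doy=1):
    """First doy at which the accumulated GDD reach `threshold`, or None."""
    total = 0.0
    for doy in range(start_doy, len(daily_mean_temp) + 1):
        total += max(0.0, daily_mean_temp[doy - 1] - t_base)
        if total >= threshold:
            return doy
    return None


if __name__ == "__main__":
    # Hypothetical series: 100 days at 0 °C followed by 50 days at 10 °C.
    temps = [0.0] * 100 + [10.0] * 50
    print(gdd_at(110, temps), doy_reaching(50.0, temps))
```

In diagnostic mode, `gdd_at` would be evaluated at the day on which the fitted curve reaches score 3; in predictive mode, `doy_reaching` would return the expected day for a given threshold.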

Here, the GDD model was used in diagnostic mode, i.e., to determine whether it is suitable to robustly and consistently describe the observations, across sites and clones. To this aim, we determined the GDD at score 3, \( {GDD}_3^{\ast } \) for each site, year and clone. Based on the previously mentioned performance of the GDD model for spring phenology, it was expected that \( {GDD}_3^{\ast } \) would be clone- but not year- and site-dependent, thus making the GDD model a robust tool to predict the timing of leaf phenology of the different clones, as a function of the year- and site-specific temperatures.

Statistical Tests

The analyses focused on different aspects characterizing spring and autumn phenology and duration of the effective growing season, and specifically: the fitted parameters of Eq. 1 for spring (t50, bb and kbb) and autumn (t50, ls and kls); the resulting length of the effective growing season; and the accumulated growing degree days at spring phenology score 3 (\( {GDD}_3^{\ast } \)).

To determine whether there were significant differences from year-to-year, the above variables were first analyzed site by site, by means of a paired t test. The dependent variables were log-transformed prior to analysis, to focus on the differences across sites.

To ascertain the role of latitude (and hence photoperiod), the same dependent variables were also tested focusing only on the five clones common to all sites. An ANOVA was performed on the log-transformed variables, with latitude and clone as fixed effects, and year as random. No interactions were considered due to the limited number of data points. Furthermore, to determine whether each clone had a similar response independently of site and year, a Friedman rank test was performed, with clone as treatment and adjusting for site. The test was performed and the ranking obtained considering years separately; and, for spring phenology and four sites, also considering 2017 and 2018 as replicates.
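The ranking underlying the Friedman test can be illustrated as follows. This is a minimal sketch, not the routine used for the analyses: it ranks clones within each block (here, a site), averages the ranks across blocks, and computes the classical Friedman statistic without tie correction and without a p value. The function names and the example values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (lowest), assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


def friedman(table):
    """Mean rank per clone and Friedman statistic (no tie correction).

    table[b][c] is the value of the variable for clone c in block b (e.g., site).
    """
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for c, rank in enumerate(average_ranks(row)):
            rank_sums[c] += rank
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return [r / n for r in rank_sums], chi2


if __name__ == "__main__":
    # Hypothetical t50,bb values (doy) for three clones (columns) in three sites (rows).
    t50_bb = [[120, 125, 130], [118, 126, 131], [122, 124, 133]]
    print(friedman(t50_bb))
```

A low mean rank indicates a clone that consistently shows low values of the variable (e.g., early bud break) across blocks.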

Finally, for more robust conclusions on the role of clone in defining the response to local climatic conditions, the Friedman test was repeated on the 10 clones common to three sites, differing in climate but not in photoperiod.

The test assumptions were visually checked. All the statistical analyses were performed in MATLAB 2018a (The MathWorks Inc., Natick, MA, USA).

Results

Within-Site Comparison Across Years

When compared with 2017, spring 2018 resulted in an earlier (i.e., lower t50,bb; Fig. 3a) and faster (i.e., higher kbb; Fig. 3b) spring phenology across all comparable sites. While autumn phenological observations were unavailable from several sites, trees in Krusenberg had an earlier (i.e., lower t50,ls; Fig. 3c) and faster (i.e., higher kls; Fig. 3d) leaf shedding during 2018 when compared with 2017. This year-to-year variability suggests annual variation in climatic conditions has an effect also on autumn phenology. In this site, the combined changes in spring and autumn phenology resulted in a small (3 days on average) but significant shortening of the effective growing season length from 2017 to 2018 (Fig. 3e).

The growing degree days accumulated at bud break reference score 3, \( {GDD}_3^{\ast } \), were larger in spring 2018 than 2017 (Fig. 4). Similar patterns were observed should scores other than 3 be considered (not shown). Hence, \( {GDD}_3^{\ast } \) cannot be considered a clone-specific parameter independent of year, a property necessary for using the GDD model, which relies only on temperature, to predict the timing of spring phenology.

Across-Site Comparison and Clone Ranking

To explore the consistency of clone responses across sites and years, we focused on the five clones common to all sites (SI, Section S2). Latitude, clone, and year affected the timing of spring phenology, t50, bb (Table 2), with delayed spring phenology at higher latitudes and in the cooler year (i.e., 2017; Fig. 5a, left). The speed of spring phenology kbb was slower and the threshold for the phenological event \( {GDD}_3^{\ast } \) was lower in 2017 than in 2018 (Table 2; Fig. 5b and 6, left); they both were also affected by clone (Table 2). The timing of autumn phenology, t50, ls, depended on latitude, with earlier leaf shedding at higher latitudes (Fig. 5c), while clone was only marginally significant. The speed of autumn phenology, kls, was not explained by the factors considered. As a result, the length of the effective growing season was affected by latitude only, with longer growing seasons at lower latitudes (Fig. 5e, left).

Table 2 Summary of the ANOVA of log-transformed variables assessed in the six sites, with latitude and clone as fixed effects and year as random. The variables are: t50, bb, i.e., doy at which score 3 is reached; kbb, i.e., rate of change of score at score 3; t50, ls, i.e., doy at which autumn score 2 is reached; kls, i.e., rate of change of score at autumn score 2; the length of the effective growing season; and \( {GDD}_3^{\ast } \), i.e., the accumulated GDD at spring score 3

Fig. 5

Spring and autumn fitted phenology parameters (a) t50, bb; b) kbb; c) t50, ls; d) kls) and e) duration of the effective growing season for the clones common to all the sites (left panels) and the additional five clones common to just three sites (right panels). When available, values for 2017 are reported with open symbols shifted to the left; those for 2018 with closed symbols shifted to the right. Symbol shapes distinguish among sites

Fig. 6

Growing degree days at which spring phenology score 3 is reached (\( {GDD}_3^{\ast } \)) for the clones common to all the sites (left panel) and the additional five clones common to three sites (right panel). Colors and symbols have the same meanings as in Fig. 5

To further investigate whether the different clones responded consistently across sites, we determined the mean ranking of the five clones common to all the sites and of the 10 clones common to three sites (Våxtorp, Šašaičiai, and Anykščiai). These three sites span only 1° of latitude (i.e., have similar photoperiod; Fig. S1) and were characterized by rather similar temperatures in 2018 (Table S1, Fig. S2). We tested whether phenology was significantly different among clones, based on the Friedman test (Tables 3 and 4). Across all the sites and the 2 years, the five clones could be consistently (p < 0.001) ranked based on the timing of spring and autumn phenology (t50, bb and t50, ls respectively); the ranking of the effective growing season duration was consistent only at p = 0.05 (Table 3). Among the clones, 722.16 underwent the earliest phenology both in spring and autumn, so that its effective growing season duration was intermediate among the common clones. For the other clones, the ranking was not consistent across spring and autumn (with, e.g., clones undergoing bud break early but shedding their leaves late, or vice versa), although the clone latest to exhibit spring phenology (44.13) was also the one with the shortest duration of the effective growing season. The remaining clones behaved rather similarly both in spring and autumn. The ranking in \( {GDD}_3^{\ast } \) largely followed that of t50, bb. The same variables could be consistently and robustly ranked for the 10 clones common to the three sites in 2018 (Table 4), but with a higher level of significance for the growing season duration (p = 0.014). Hence, the proposed approach of ranking clones to ascertain relative differences appears robust also when considering clones grown in sites with similar photo-climatic conditions.

Table 3 Summary of the ranking of the five clones common to all sites, based on the Friedman test. The lower the ranking, the lower the value of the corresponding variable for that clone. When the 2 years are combined, year is used as replicate. Meaning of variables is the same as in Table 2

Table 4 Summary of the ranking of the 10 clones common to three of the sites (Våxtorp, Šašaičiai, Anykščiai), as per the Friedman test of the variables assessed in 2018. The lower the ranking, the lower the value of the corresponding variable for that clone. Meaning of variables is the same as in Table 2

These differences across clones and years in timing of phenological events were not accompanied by differences in key leaf-level properties (light-saturated CO2 assimilation rate, chlorophyll content, and specific leaf area; Fig. S5 and S6), even when considering clones with large differences in phenology (e.g., clones 44.13 and 722.16).

Discussion

Methodological Considerations

We focused on leaf phenology and effective growing season duration and how they can change in the future, because the length of the effective growing season is proportional to annual height growth [5, 36,37,38,39]. Furthermore, there was no clear link between timing of spring phenology and leaf traits in selected clones in one site (Krusenberg); rather, there was a larger variability across individual trees and clones (SI, Section S4). As such, focusing on leaf phenology and the effective growing season length as determinants of the potential productivity appears appropriate, at least for these clones. Nevertheless, a correlation of these leaf traits with the timing of some phenological events has been observed along a geographical gradient in P. trichocarpa [37]. Moreover, higher light-saturated CO2 assimilation rate and lower specific leaf area were observed in Populus clones adapted to higher latitudes [40, 41] and speculated to be an adaptation to shorter growing seasons [42]. Should this be the case, differences in effective growing season length could lead to smaller than expected differences in potential productivity, although the two are correlated [5, 36,37,38].

For autumn phenology, we focused on leaf shedding as opposed to leaf color change, because any remaining green leaf is capable of substantial opportunistic carbon assimilation, supporting biomass growth [33]. Indeed, leaf shedding correlates well with biomass growth [43,44,45]. Furthermore, leaf shedding scores led to a more robust fitting than leaf coloring scores. Nevertheless, time of leaf color change and shedding were largely correlated (SI, Fig. S4). Hence, while the choice of leaf shedding can affect the duration of the effective growing season, our main conclusions in terms of drivers of the clone ranking remain unaltered.

The above analyses and conclusions focused on the fitted parameters of Eq. 1. The advantages of looking at the model parameters instead of directly at bud break dates are that they allow exploiting also low-frequency observations and capturing different aspects of the phenological development. In particular, the parameters of Eq. 1 permit partially separating the timing and speed of phenological development, which can be differentially affected by latitude and thermal conditions. Indeed, the speed of phenological development was largely independent of latitude and clone, while the timing of intermediate phenological stage was affected by these aspects (Table 2, Fig. 5).

The use of the GDD model also aims at partially disentangling the different aspects affecting spring phenology, specifically removing the effects of thermal conditions by transforming elapsed time into a quantity that is more meaningful from a plant physiological point of view. We chose the GDD model because of its simplicity and hence limited data requirements for parameterization. The model has previously performed well, often exceeding more complex models [29, 30]. We did not consider chilling requirements because their inclusion has not improved model performance [30]. Furthermore, while accumulation of chilling units can affect the timing of bud break, with higher chilling units leading to lower GDD* [46, 47], Populus spp. have been shown to accumulate the required chilling time by January at similar latitudes but in warmer climates [46] than the six common gardens, or to be largely insensitive to accumulated chilling units [48]. Even when chilling affected bud break, the extent of such effect was independent of clone provenance [22], thus bearing no consequence on the ranking of clones within each location.

Our dataset allows partially disentangling the role of temperature (and its yearly variability) and latitude. For example, Remningstorp and Šašaičiai were characterized by similar temperatures (on average and during the sampling years; Fig. S2), but are located at different latitudes, and hence differ in photoperiod (Fig. S1). Nevertheless, the low sampling frequency in some sites and low replication of some clones can reduce the robustness of our conclusions.

Site-, Year-, and Clone-Specific Response of Leaf Phenology

Within each site, spring phenology occurred earlier and faster in the warmer of the 2 years (2018). Temperature is generally recognized as the main cue for spring phenology in Populus [22, 49, 50], although its effect on spring phenology is modified by clone response to day length, in line with results for other species in Europe and North America [10, 11, 51].

Leaf shedding occurred earlier and faster in 2018 than in 2017 at the site for which this analysis was possible (Krusenberg; Fig. 3d). These differences might be ascribed to warmer autumn temperatures (Fig. S2 and Table S1), to the dry conditions (Fig. S3) that accompanied the high temperatures during most of summer 2018, or to their combination. Both warmer temperatures and dry conditions may have contributed to an earlier bud set and autumn phenology [14, 18]. Indeed, late summer drought can lead to premature senescence [52]. This year-dependence of autumn phenology is in contrast to the general understanding that growth cessation and autumn phenology initiation are mostly cued by changes in photoperiod or in the light spectrum at the origin site [21, 53]: because these conditions depend on latitude, they are stable from year to year. Nevertheless, while warmer-than-normal temperatures should not affect the initiation of leaf senescence, its speed is temperature-dependent at least in some species [54]. The parameter kls directly characterizes the speed of stage transition around the intermediate stage, but t50,ls can also be affected. Yet, there was no clear relation between kls and t50,ls (not shown), suggesting an effect of growing conditions on both parameters. Another potential mechanism could be the observed correlation between bud break and autumn senescence emerging at the regional scale, with earlier (later) senescence associated with earlier (later) spring phenology [55]. Year 2018 was expected to have a longer effective growing season than 2017 [2], because of the overall higher spring temperatures and considering the generally lower dependence of autumn phenology on growing conditions. Nevertheless, the combined effects of a warmer spring, a drier and warmer summer, and a warmer autumn led to a shorter effective growing season in 2018 than in 2017.

Comparisons among the five common clones across sites showed that latitude and year affected the timing and speed of spring phenology as well as the timing (but not the speed) of leaf shedding (Table 2). This points to a role of latitude (and hence light-related cues) in autumn phenology, as expected. However, due to the limited data availability for 2017, it is not possible to robustly disentangle the effects of latitude and temperature on leaf shedding. Qualitative comparisons between sites with similar temperatures in 2018 but different latitudes (Krusenberg and Remningstorp vs. Anykščiai; and, to a lesser extent, Våxtorp vs. Šašaičiai) suggest an earlier leaf shedding in the northernmost site, whereas the difference in speed was smaller (Fig. 5c, d). While results in the literature are equivocal [18 and references therein], we find evidence that higher temperatures resulted in earlier leaf shedding between sites at similar latitudes (Våxtorp vs. Ludza). Hence, autumn phenology appears to be determined by a combination of cues, with photoperiod as the primary influence, mediated by climate variables including temperature and rainfall.

The GDD corresponding to the intermediate spring score, \( {GDD}_3^{\ast } \), also appeared dependent on year, both when considering each site in isolation (Fig. 4) and the common clones across all sites (Fig. 6; Table 2). Furthermore, when focusing on the five common clones, \( {GDD}_3^{\ast } \) was affected by latitude and clone (Table 2). GDD-based models have been used extensively to simulate phenological events like spring phenology [29, 30] under the implicit assumption that the GDD requirement for a specific score (GDD*) does not depend on temperature and, often, latitude. Model performance is generally satisfactory [29] and exceeds that of more detailed models [29, 56], although some limitations of this modeling approach have been observed [2, 35, 57]. In particular, GDD* declines with day length: in warm springs, when temperatures become high while days are still relatively short, GDD* is higher than in cooler springs, when days warm up later in the year [10]. This expectation is in line with our results, with higher GDD* in 2018 than in 2017 (Figs. 4 and 6). When comparing the same clones grown in different locations, GDD* increased with decreasing latitude (Fig. 6), where days were shorter at the occurrence of the phenological event (Fig. S1): this was the result of both differences in photoperiod across locations, with shorter days at lower latitudes, and warmer temperatures and hence an earlier bud break (i.e., lower t50, bb). So, while the emerging pattern in GDD* agrees with recent observations for other species, the lack of a GDD* that is clone-specific but independent of latitude and temperature makes it more difficult to predict the timing of spring phenology for a specific clone as a function of current or future climatic conditions. This is a major limitation to the use of the GDD model in prognostic mode, e.g., to project the timing of a specific phenological stage under altered climatic conditions [51].
Nevertheless, data from the clones common to several sites (Fig. 6) suggest that GDD* allows a very robust ranking of the clones, also across sites (and hence latitudes) and years. As such, GDD* obtained for a set of clones grown in one location can help to pinpoint relative differences among them, differences that we showed are retained under other photoperiods and climatic conditions. The timing of intermediate phenological scores (t50, bb) also allows a consistent ranking of the clones, independently of location and year, but GDD* magnifies the differences (compare the ranges in Figs. 5a and 6). Hence, GDD* can be used to select appropriate clones for growth in novel conditions because the phenology ranking based on a single site is maintained under other growing conditions.
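
One way to quantify how well a clone ranking is conserved between two sites is a rank correlation. The sketch below uses Spearman's coefficient as an illustrative choice; it is not necessarily the statistic used in this study. The clone labels and thermal-time values are hypothetical placeholders, not data from the common gardens.

```python
def ranks(values):
    """Rank positions (1 = smallest value); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(x, y):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical thermal-time thresholds (degree days) for five placeholder clones
clones = ["A", "B", "C", "D", "E"]
site_1 = [120.0, 135.0, 150.0, 160.0, 180.0]
site_2 = [95.0, 104.0, 118.0, 121.0, 140.0]

print(spearman(site_1, site_2))  # 1.0: identical ranking despite different absolute values
```

In this made-up example the absolute thresholds differ between the two sites, as would be expected if they depend on latitude and year, yet the ordering of the clones is identical. That is the pattern a diagnostic use of the GDD model relies on.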

Implications for Clone Selection in the Face of Climate Change

Most existing Populus clones currently used for biomass production in Europe were developed for current conditions at southern latitudes. Adapting biomass production to future climates at higher latitudes requires determining the response of existing clones to altered climatic conditions and light features at these latitudes. We found that warmer conditions result in earlier (and faster) bud break and earlier (and faster) leaf senescence (Fig. 3a–d). Regarding spring phenology, there are also indications that warmer temperatures increase the GDD required to reach a given phenological stage, because of the influence of day length on spring phenology, thus effectively reducing the possibility of further advancement. This day length-induced restriction is less severe at higher latitudes, underlining once more the need for clones adapted to the combination of climatic and photoperiodic conditions, to ensure a low risk of exposure to late-occurring damaging low temperatures together with higher productivity.

Spring and autumn phenology affect net carbon uptake [4] and overall tree growth [5], with potential local and global implications for carbon cycling [e.g., 58]. Warmer temperatures, while resulting in an earlier bud break, can also enhance the speed of autumn phenology and possibly alter its timing [55]. Hence, the expected increase in temperature under climate change does not necessarily translate into a longer effective growing season and enhanced potential for growth. Moreover, while warmer temperatures can in principle enhance the leaf photosynthetic capacity [59, 60], warming will act on both gross CO2 assimilation and respiration, potentially with little effect on net CO2 assimilation rate. In addition, Populus species generally have wide optimal ranges of temperatures for assimilation and limited thermal acclimation response [61, 62, 63]. Finally, warmer air temperatures can lead to stomatal closure as the result of higher vapor pressure deficit and lower plant water availability, even under unaltered precipitation. It follows that the direct effects of global warming on potential carbon fixation rate at least partially cancel out, so that phenological response remains key in defining the success of a specific clone. It is thus necessary to develop approaches to evaluate the net response of clones to climate and photoperiod across a wide range of latitudes. Ideally, such evaluation should not require direct phenological observations from a large number of locations and climates, but rather build upon those from a restricted subset of geographic locations and climates.

The GDD model appears to have limited predictive capability for spring phenology, because of the latitude- and year-dependence of the GDD threshold for specific phenological stages. This is in contrast with previous conclusions that the GDD model can effectively predict the spring phenology of many species, and with its widespread use, also for Populus spp. [34], but in line with recent experimental observations on deciduous species across Europe and North America [10, 51]. Thus, the GDD model might not be adequate to predict the timing of spring phenology, in particular when aiming at capturing the (often small) differences among related clones. The positive results of applying the GDD model to predict spring phenology [29, 30] suggest that this approach can discriminate among species, where differences in timing of bud break are larger, rather than among closely related clones, as was the case here. Nevertheless, our results show that clone responses to seasonal cues were consistent when determined based on GDD* (Tables 3, 4), across sites, years, and origins. Hence, the GDD model, while not fully adequate to predict the timing of phenological events, can be used in diagnostic mode, for a robust ranking of clones as early or late, or as demanding low to high cumulated thermal time for spring phenology. The ranking of clones based on GDD* follows that based on the timing of phenology (i.e., t50, bb; Tables 3, 4), but GDD* amplifies the relative differences among clones. As such, the GDD model, and the parameter GDD* in particular, can provide a tool to screen clones for their relative response to spring conditions, needing observations from only a few sites or years. Knowledge of the relative differences among clones as emerging from one site can thus be used to predict the relative differences among those clones when planted in other sites (i.e., under different photoperiods and climates) or under other climatic conditions (including future ones).
Because of the high heritability of bud break [64], this approach can support the choice of locally adapted species. The time by which 50% of the leaves are shed in the autumn also allows ranking species, but the ranking appears less robust, and we could not test whether it was conserved across years.

Conclusions

Both latitude and temperature contribute to defining spring and autumn phenology in Populus spp. While the causal roles of temperature in spring phenology and of latitude/photoperiod in autumn phenology are well recognized, the effects of photoperiod in the spring and of temperature in the autumn are often overlooked. We show that latitude affects spring phenology and that, as a consequence, the simple growing degree days (GDD) model cannot be used as a reliable predictor of the specific timing of phenological stages based on observed or forecast temperatures. Nevertheless, the GDD model, and in particular the cumulated GDD necessary to achieve a specific phenological stage, allows a robust ranking of early and late clones, which is conserved across years and sites. As such, it provides a tool to predict the relative differences among clones when grown under different photoperiods or climatic regimes, based on observations from a single site. While not prognostic in the sense of determining the timing of phenological stages in a specific location and weather, this tool can support the relative evaluation of clones of different genetic background, based on a limited number of observations.