1 Introduction

Tree-ring sequences from various proxies (including tree-ring width, TRW) are widely used to reconstruct climate variability over recent millennia, because tree-rings can be accurately dated and such data can be obtained from widely distributed geographical regions. When statistically significant relationships are found between a tree-ring chronology and instrumental meteorological variables, a climate reconstruction using this relationship can be made, which may be reproducible among sites within a region (Fritts 1976; Babst et al. 2018). Much of the evidence regarding variations in hemispheric and global temperatures during the last millennium come from tree-ring data (Mann et al. 1999, 2008; Briffa et al. 2001; Esper et al. 2002; Moberg et al. 2005; Shi et al. 2015; Wilson et al. 2016; Anchukaitis et al. 2017). These records show variability on decadal to multidecadal timescales that is broadly consistent for different hemispheric reconstructions (Christiansen and Ljungqvist 2017; Neukom et al. 2019). However, the low-frequency variability (i.e., the centennial scale) is still poorly constrained. For example, the differences between different hemispheric reconstructions are still so large that we cannot distinguish the common era during the Medieval Climate Anomaly/the Little Ice Age (Fernández-Donado et al. 2013).

One challenge associated with tree-ring-based climate reconstructions the difficulty of separating trends caused by tree growth evolution and low-frequency climate variability. Indeed, the retention of low-frequency (i.e., centennial-scale) information remains a critical source of uncertainty in past climates reconstructed from tree-rings (Briffa et al. 2001; Esper et al. 2002; Gunnarson and Linderholm 2002; Björklund et al. 2013; Franke et al. 2013; Yang et al. 2014). The tree growth trend is usually defined as an age-dependent, long-term trend caused by the ontogenetic changes of tree geometry (Fritts 1976; Briffa and Melvin 2011; Peters et al. 2015). Forestry measurements have shown an ideal growth curve can be theoretically represented by a juvenile phase of rapid increase in TRW lasting for a few years (in some case persisting for more than a decade) of a tree’s life, presumably as the canopy expands and the tree emerges from the shade of the neighboring trees. This is followed by a slowing down of growth once the tree reaches maturity, when the smaller annual tree-ring increments are related to the increasing size of the tree trunk. This expected age-related tree growth shape can be approximated (e.g., with a Hugershoff curve), but in reality, each tree has unique ecological conditions and will have its own tree growth trend.

The curve fitting standardization (CFS) method removes the tree growth trend from tree-ring measurements using various curves that attempt to account for growth-related trends, which assumes that these trends can be described by some type of smooth functions applicable to the tree-ring measurement series (Fritts 1976). The curves can be obtained from various models, including the smoothing spline curve (Cook and Peters 1981), negative exponential curve, linear curve, Hugershoff curve (Fritts 1976; Fang et al. 2010), smoothing spline curve (Cook and Peters 1981), time-varying-response smoothing (TVRS) method (Melvin et al. 2007), and ensemble empirical mode decomposition (EEMD) curve (Zhang and Chen 2017). Thus, CFS is an extreme standardization method, as it removes almost the entire low-frequency signal because of its application to each individual tree-ring measurement. The time-scale of the low-frequency signal is determined by the age range of each tree-ring sample, which in this case is usually multi-decadal or centennial in scale. Although a CFS standardized chronology may lose its low-frequency variability (Esper et al. 2002), it is still widely used because of its computational simplicity and adaptive capacity.

The regional curve standardization (RCS) method is designed to retain as much long-term climate variability information as possible from tree-ring data, placing it on the opposite side of the standardization spectrum from CFS. This method assumes that the environmental factors limiting growth in trees are randomly distributed and expected to cancel out when the age-aligned data are averaged to form a common growth curve (Fritts 1976; Briffa et al. 1992; Briffa and Melvin 2011). If the age-related variability is independent from climate and removed from the raw measurements, the resulting indices will contain pure climate information. However, two main sources of bias, can distort the common forcing signal when using the RCS method (Briffa and Melvin 2011). The ‘trend-in-signal’ bias is applicable if the underlying common climate signal is close to or longer than the length of the chronology, in which case the RCS chronology will be biased in both the start and the end of the chronology. The ‘differing-contemporaneous-growth-rate’ bias is related to the problem of the mixing of variable-growth-rates of trees, given that individual samples with different longevities have various tree growth rates, resulting in a spurious positive trend across the chronology. The challenge is to have a sufficient sample replication in order to: (1) produce a locally unbiased regional curve (RC) without any climate information to deal with the “trend-in-signal” issue, and (2) account for the different tree growth rates of different samples to avoid the “differing-contemporaneous-growth-rate” bias. The signal-free RCS method (Melvin and Briffa 2008) was developed to mitigate the first bias, standardizing the data using locally unbiased RCs. The two regional curves (RCs) standardization method (Esper et al. 2002) largely alleviate the second bias. Later studies have attempted to address the above described shortcomings (Melvin et al. 2013; Yang et al. 2014; Helama et al. 2017), but the problem has not yet been completely resolved.

In addition, there are a number of alternative detrending approaches, including the adaptive regional growth curve (Nicault et al. 2010), environmental curve standardization (Helama et al. 2005), basal area increment (Biondi and Qeadan 2008), Eigen analysis (Yang et al. 2012), Bayesian (Schofield et al. 2016), and response surface methods (Matskovsky and Helama 2015). However, a perfect standardization method that deals with all biases inherent to tree-ring data has yet to be found (Peters et al. 2015).

The integration of the conservative CFS method and the dynamical RCS method has great potential to ultimately produce robust climate signals from tree-ring data, especially in the low-frequency band. As an example, Björklund et al. (2013) developed a method to derive “regionally” constrained individual signal-free chronologies, where the fitting curve for each measurement was forced to have the same mean as their RC, which was then used to standardize the raw tree-ring data. This artificial setting was not successful in fully solving the issue (Helama et al. 2017). The motivation for the current study is to generate the ensemble standardization constraints to account for all possible situations, in which some extreme samples are standardized using the CFS method, and the other homogeneous samples are standardized using the RCS method. To achieve this, we introduce a novel trend similarity ranking (TSR) method, and evaluate it on three tree-ring datasets, all covering the Common Era: two TRW chronologies, namely the Yamal (Briffa et al. 2013) and Torneträsk (Melvin et al. 2013), and one maximum latewood density (MXD) chronology from Northern Scandinavia (Esper et al. 2012). We hypothesize that his new method enhances our ability to obtain low-frequency climate variability recorded by tree-ring archives.

2 Data and methods

The three chronologies used to evaluate the TSR method cover more than 2000 years, and have estimates of pith offset for each individual tree-ring series. The pith offset information ensures that the tree-ring measurements can be ordered by cambial age when performing RCS.

2.1 Tree-ring records and instrumental climate data

The datasets used to test the applicability of the TSR method consisted of the Siberian larch (Larix sibirica) TRW data from the Yamal Peninsula of northwest Siberia (Briffa et al. 2013), Scots pine (Pinus sylvestris L.) TRW data from the Torneträsk region in northernmost Sweden (Grudd 2008; Melvin et al. 2013), and Scots pine MXD data across Northern Fennoscandia (c. 66–69° N, 20–32° E) (Esper et al. 2012). The Yamal TRW data are significantly correlated with summer (June–July) temperature anomalies (from the CRUTEM4v dataset; Jones et al. 2012) in the grid-box centered on 67.5° E and 67.5° N (Briffa et al. 2013). The Torneträsk TRW data are strongly correlated with the observed mean May–August regional (Bottenviken) temperature record (Melvin et al. 2013). The instrumental temperature anomalies were calculated with respect to the 1961–1990 period in subsequent analyses. The MXD data from Northern Scandinavia are significantly correlated with the summer (June–August) temperature anomalies (Esper et al. 2012). However, instrumental data are not available in Esper et al. (2012), and the original reconstruction is therefore considered as the reconstructed target in this study. Detailed information about the three datasets, including species, longevity, sample count, and climate response, is provided in Table 1.

Table 1 Metadata for the TRW and MXD chronologies from Yamal, Torneträsk and Northern Scandinavia used in this study

Each of the three datasets contains more than 400 samples, with good distribution through time, and are thus likely to yield robust average values in every biological year and thus minimize the “trend-in-signal bias” in the RCs. The chronologies and regression uncertainties of the datasets are discussed in their original papers (Esper et al. 2012; Briffa et al. 2013; Melvin et al. 2013). Here, we focus on the uncertainty estimation in the tree-ring chronologies due to the two used standardization processes.

2.2 Trend similarity ranking (TSR) method

We propose a TSR method to quantitatively estimate the uncertainty ranges in the tree-ring chronologies associated with age-trend effect using the two standardization methods (i.e., the CFS and RCS methods). The TSR method assumes that the inner limit of the chronology uncertainty can be obtained from the CFS method, and the outer limit from the classical RCS method. The TSR method is outlined in Fig. 1 and described in detail below.

Fig. 1
figure 1

Flow chart summarizing the TSR method

Step 1::

Computing the “expected” growth curve for individual tree-ring series

The TVRS method is compared with the nine filters (including cubic spline, Bartlett–Hann, Blackman, Gaussian, Hamming, Hann (Hanning), Lanczos, Triangular, and Tukey (tapered cosine) windows) in Fig. s1. There are no distinct differences among these filters under three cut-off frequencies in Fig. s1a–c, except, due to the boundary effects, at the beginning or end of the time series. However, the time-varying-response smoothing is completely different to the spline smoothing in Fig. s1d. This is because the cut-off frequency of the time-varying-response smoothing method decreases with the time, which is an improvement from the cubic smoothing spline method (Cook and Peters 1981), with time-dependent flexibility according to the tree growth features (Melvin et al. 2007). Moreover, the different initial stiffness had no distinct impact on the time-varying-response smoothing, except for the beginning of the time series in Fig. s1d. Thus, the TVRS method (Melvin et al. 2007) was used to represent the intrinsic growth trend in each tree-ring series for the CFS method with 0 year initial stiffness.

Step 2::

Establishing the regional growth curve

The expected regional growth curve was derived using the classical RCS method (Briffa et al. 1992). An arithmetic mean curve was computed of all the tree-ring series aligned by their biological ages (years from pith) (Briffa et al. 1992). The mean curve was then smoothed by the TVRS method (Melvin et al. 2007) to obtain a biologically justified, monotonic, decreasing the RC.

A number of additional methods for computing the RCs (the median value, Tukey’s biweight robust mean, and maximum probability value) are compared and analyzed in Fig. s2, based both on the raw measurements and TVRS smoothed measurements. The results show that the Tukey’s biweight robust mean is intermediate between the median and the arithmetic mean values. It is easier for the TVRS smoothed tree-ring measurements to obtain a consistent RC than for the unsmoothed tree-ring measurements. Thus, the Tukey’s biweight robust mean for the TVRS smoothed tree-ring measurements was used to obtain the regional growth curve in this study.

Step 3::

Comparing the RC and curve fitting trends

Linear regressions of the RC (obtained in step 2) and the curve fitting trend for each individual tree-ring series (step 1) were computed, where a regression coefficient of 1 means that the long-term trend in the tree-ring series is the same as the RC (ignoring the regression error). All tree-ring measurement series were ranked according to the regression coefficient.

Step 4::

Looping through the standardization process

The first tree-ring index TRI1 member (chronology) was calculated from the Tukey’s biweight robust mean of all (n) standardized measurement series according to the calendar year, when all (n) measurement series are assumed to be standardized using their curve fitting trends. This process is completely consistent with the CFS method. The last member TRIn is that the TRIn chronology is derived from the Tukey’s biweight robust mean of all the measurement series standardized according to the classical RCS method. The ith chronology (TRIi) represents the first i measurement series that was standardized using the RC, since its curve fitting trends is more similar to the RC than the remaining ones. The remaining (ni) measurements were standardized using their curve fitting trends. The final ith chronology (TRIi) was calculated by taking the Tukey’s biweight robust mean of all the standardized TRW series according to their calendar year. The specific formulas are as follows,

$$TRI_{i} = {\text{mean}}\left( {\frac{{TRM_{{\left[ {i, \ldots ,n - 1} \right]}} }}{{CFT_{{\left[ {i, \ldots ,n - 1} \right]}} }} + \frac{{TRM_{{\left[ {1, \ldots ,i - 1} \right]}} }}{RC}} \right), \quad i = 2, \ldots n - 1$$
(1)
$$TRI_{1} = {\text{mean}}\left( {\frac{{TRM_{{\left[ {1, \ldots ,n} \right]}} }}{{CFT_{{\left[ {1, \ldots ,n} \right]}} }}} \right)$$
(2)
$$TRI_{n} = {\text{mean}}\left( {\frac{{TRM_{{\left[ {1, \ldots ,n} \right]}} }}{RC}} \right)$$
(3)

where tree-ring index (TRIi) is the ith member of the tree-ring index chronology. TRM[i,…n] are the tree-ring measurement from the ith to nth series. n is the number of the tree-ring measurement series. CFT[i,…n] are the curve fitting trends of the tree-ring measurement series from ith to nth, and RC is the regional tree growth curve.

Step 5::

Building a TSR chronology

All chronologies (i.e., members) obtained in step 4 were sorted according to their calendar years. The TSR chronology (50th percentile equal to the median) and its uncertainty range (5th–95th percentiles) were defined as the 50th, 5th, and 95th percentiles along the calendar year of the chronologies. The specific formulas were as follows:

$$TSR_{u} = {\text{prctile}}(TRI_{{\left[ {1, \ldots ,n} \right]}} ,95{\% })$$
(4)
$$TSR_{l} = {\text{prctile}}\left( {TRI_{{\left[ {1, \ldots ,n} \right]}} ,5{\% }} \right)$$
(5)
$$TSR = {\text{prctile}}\left( {TRI_{{\left[ {1, \ldots ,n} \right]}} ,50{\% }} \right)$$
(6)

where the TSRu is the upper limit of all chronologies, TSRl is the lower limit of all chronologies, and TSR is the median values of all chronologies. In methods used to reconstruct climate over the past millennia, the choice of the 50th percentile to represent the final result of the ensemble reconstructions is an idiomatic expression (e.g., Neukom et al. 2014). In general, the median and mean are more or less identical, but the median is more robust in regard to outliers. Thus, this choice has very little effect on the result. On the other hand, the 50th percentile corresponds to the maximum of the probability density function distribution if the uncertainty has a normal distribution. This means that the 50th percentile has the highest probability of occurrence at the same frequency.

2.3 Ensemble empirical mode decomposition (EEMD) method

The EEMD method is used to adaptively decompose tree-ring reconstructions into different climate components with various time-scales. It is an adaptive time-frequency data analysis approach that handles nonlinear and non-stationary time-series (Huang et al. 1998; Wu and Huang 2009). The EEMD technique has been applied in many dendroclimatological and meteorological studies (see e.g., Guan et al. 2012, 2018; Qian et al. 2011; Shi et al. 2012; Fang et al. 2013). In dendroclimatology, the EEMD method has been used to analyze paleoclimate variability in TRW chronologies over multiple timescales (Guan et al. 2012) and to improve transfer functions linking tree ring observations to meteorological variables across different frequency bands (Shi et al. 2012). Recently, the EEMD approach was considered as an alternative standardization method for removing the age-dependent growth trend from TRW series (Tian et al. 2015; Zhang and Chen 2017), and extracting non-climatic disturbances from individual tree-ring measurements (Fang et al. 2013).

For monthly and annual climate datasets, we used the EEMD method parameter values proposed by Qian (2016). The standard deviation of the added white noise in each EEMD member was set to 0.4 and the ensemble size to 1000. In fact, an increase in noise amplitudes and a greater number of ensemble members was shown to have a minor influence on the decomposition, as long as the added noise has a moderate amplitude and the ensemble large enough number of trials. This is because the added noise will theoretically be cancelled out by a large ensemble mean (Wu and Huang 2009; Qian 2016).

2.4 Determination of linear and non-linear trends

The linear trend is often used to reflect the global warming rate (Hartmann et al. 2013). Here, the linear trend was calculated using the function ‘detrend’ in MATLAB through a least-squares fit of a straight line, and the non-linear trend was derived using the EEMD method. The EEMD trend (TEEMD) was defined by Qian (2016) as:

$$T_{EEMD} = \frac{{R_{eyr} - R_{syr} }}{eyr - syr + 1} \times 100$$
(7)

where the syr and eyr are the start and end years of the analyzed period, respectively, and Rsyr and Reyr are the values of the curve fitting trend using the EEMD method at the start year and end years of the analyzed period, respectively.

2.5 Determination of the contribution of different components across time-scales

The EEMD method was also used to decompose various components of temperature reconstructions across different time-scales. Based on previous studies (Mann et al. 1995; Shi et al. 2017), the inter-annual component was defined as < 8 years, inter-decadal component as ≥ 8 years and < 35 years, multi-decadal component as ≥ 35 years and < 100 years, and centennial component as ≥ 100 years and < 600 years. The secular trend is the last component, and the millennium-scale variability is the sum of the remaining components.

The contribution of the different components (CTEEMD) across the time scale was defined as follows:

$$CT_{EEMD} = \frac{{C_{eyr} - C_{syr} }}{eyr - syr + 1} \times \frac{100}{{T_{LSF} }}$$
(8)

where the Csyr and Ceyr are the values of the EEMD component in the start year (syr = 1901) and end year (eyr = 2000) of the 20th century. The TLSF is the linear trend of the instrumental temperature record during the 20th century obtained using the least-squares fit method.

3 Results and discussion

3.1 Examples of the two standardization approaches in the TSR method

We provide six examples to illustrate the results of using the CFS and RCS methods on the detrended measurements (Fig. 2). Figure 2a, c, e show examples of significant and positive relationships between the curve fitting trends of the measurements and the RC, and Fig. 2b, d, f show examples of negative relationships. In Fig. 2a, c, e, there are no obvious differences between the two methods, except for a small difference in the means, but there are distinctly different variabilities in the low-frequency band at the start of the detrended measurements in Fig. 2b, d, f. The magnitudes of the CFS results are obviously larger than those of the RCS at the start of the detrended series, but become smaller at the end. These spurious variabilities or trends generate the ‘differing-contemporaneous-growth-rate’ bias to the RCS results, and can be removed using the CFS method. This demonstrates the necessity to include both approaches in the TSR method.

Fig. 2
figure 2

Six examples of the CFS (red lines) and RCS (blue lines) results. The raw tree-ring measurements (black lines) are divided by the curve fitting trend (red dashed lines) and RC (blue dashed lines)

3.2 Ensemble TSR chronologies

Using the TSR method, we calculated the uncertainty range around the three chronologies arising from tree growth trends (Fig. 3). Not surprisingly, the RCS chronologies are closer to the outer limits of the uncertainty for the maximum low-frequency signals, and the CFS chronologies approach the inner limits of the uncertainty for the minimum low-frequency signals.

Fig. 3
figure 3

Uncertainty ranges (gray shading) of the influences of the tree growth trend on the TRW chronologies from a Yamal and b Torneträsk, and c MXD chronologies from Northern Scandinavia. The TSR chronology (red line) is the 50th percentiles of the uncertainties using the TSR method. The CFS chronology is the black line. The RCS chronology is the blue line. All chronologies were filtered with the LOWESS method (span = 0.1). The gray line is the sample count

The TSR chronologies, derived from the 50th percentile of the uncertainty range, lies between the CFS and RCS chronologies. There are generally consistent variabilities in the TSR and RCS chronologies, except in the start and end parts of the chronologies, but there are discernible differences between the TRS and CFS chronologies. The amplitudes of the three chronologies are very different, but this does not depend on the varying number of samples through time. This demonstrates that the standardization process of the tree-ring measurements significantly affects the amplitudes of the low-frequency variability. The RCS chronologies have the largest standard deviations, but it is difficult to verify if this is due to regional climate variability. For example, there is a sharp upward trend at the end of the RCS TRW chronologies (Fig. 3a, b), but this trend is not obvious in the RCS MXD chronology (Fig. 3c), which is supposed to represent a superior temperature signal compared to the TRW records in Torneträsk and may be caused by the end effect.

To further assess this trend, we compared the two TRW chronologies and the MXD chronology with a number of independent, proximal, temperature-sensitive, proxy records (Fig. 4). These include a speleothem-based oxygen isotope record from Okshola cave, northern Norway (Linge et al. 2009), two oxygen isotope records from the Lomonosovfonna and Austfonna ice cores, Svalbard (Isaksson et al. 2005), five chironomid records from Lake Hampträsk, southern Finland (Luoto et al. 2009), Lake Pieni-Kauro, eastern Finland (Luoto and Helama 2010), Lake Njulla (Larocque and Hall 2004), Lake 850 (Larocque and Hall 2004), and Lake Vuoskkujavri (Larocque and Hall 2004), and a height increment of chronology from northern Finland (Lindholm et al. 2011). None of these proxy records have such sharp upward trends at the end of their chronologies as the RCS TRW chronologies (Fig. 3), although all proxy records have some limitations.

Fig. 4
figure 4

Comparison of non-tree-ring proxy records of (a) a speleothem-based oxygen isotope ratio record from Okshola cave, northern Norway (Linge et al. 2009), (b) two oxygen isotope records from the Lomonosovfonna and Austfonna ice cores, Svalbard (Isaksson et al. 2005), (c–f) five chironomid records from Lake Hampträsk, southern Finland (Luoto et al. 2009), Lake Pieni-Kauro, eastern Finland (Luoto and Helama 2010), Lake Njulla (Larocque and Hall 2004), Lake 850 (Larocque and Hall 2004), Lake Vuoskkujavri (Larocque and Hall 2004), and (i) height increment of chronology from Fennoscandia (Lindholm et al. 2011). The gray shading bars indicates the 20th century

Moreover, the uncertainty ranges at the start and end of the chronologies are wider than in the middle for the TRW datasets, and there are distinct downward trends at the beginning and upward trends at the end of the chronologies. The divergence in the early part of the chronologies is of less concern in climate reconstructions, as this part of the chronology is usually omitted due to low sample counts. However, the divergence in the more recent part of a TRW chronology affects the reliability of any calibration with the instrumental meteorological data. The regression coefficient of the transfer function will severely affect the amplitude of the reconstruction and the long-term trend. For the CFS chronologies, the trends in the more recent parts of the chronologies are underestimated because the CFS method suppresses most of the low-frequency signal. In contrast, the low-frequency signal can be overestimated using the RCS method [e.g., the end-effect bias (Briffa and Melvin (2011)]. Although the signal-free RCS method (Melvin and Briffa 2008) may mitigate part of the end-effect bias, there is presently no complete solution to this problem. In addition, there is no such as distinct upward trend in the RCS MXD chronology. Thus, in order to establish a more robust confidence interval for TRW climate reconstructions, it is essential to recognize and estimate the uncertainty related to the end-effect bias.

3.3 Three temperature reconstruction examples

In this section, we provide three examples on how to obtain the confidence intervals for tree-ring-based temperature reconstructions. We follow prior recommendations, and only use those parts of the chronologies for analysis where the sample depth is sufficiently high and the expressed population signal value exceeds the 0.85 threshold (Wigley et al. 1984). For the Yamal record this corresponds to 95 BC–2005 AD (Briffa et al. 2013), for the Torneträsk record the period is AD 81–2010 (Melvin et al. 2013), and for the Northern Scandinavia record this is 136 BC–2006 AD (Esper et al. 2012). By truncating the chronologies, we completely avoided the start effect bias.

Figure 5a shows that the Yamal signal-free RCS reconstruction lies between our new TSR reconstruction and its lower limit of uncertainty, indicating that the signal-free RCS reconstruction does not diverge from the uncertainty estimation obtained using the TSR method. Moreover, Fig. 5b also shows that the Torneträsk TSR reconstruction is close to the original reconstruction of Melvin et al. (2013), which was derived from the two RCs standardization method. This illustrates that a TSR-derived chronology may be similar to a RCS chronology derived from two RCs standardization, but more tests are needed to confirm this. These two original reconstructions using the two variants of the RCS method both fall within the uncertainty range in Fig. 5a, b, indicating that the uncertainty quantification using the TSR method includes these two variants.

Fig. 5
figure 5

Comparison of the TSR reconstruction (red line) with the a signal-free RCS reconstruction (green line) from Briffa et al. (2013) in Yamal, b two RCs standardization reconstruction (green line) from Melvin et al. (2013) in Torneträsk, and c multiple RCs standardization reconstruction (green line) from Esper et al. (2012) in Northern Scandinavia. The upper and lower limits of the TSR reconstruction are represented by the blue and black lines, respectively. All reconstructions are filtered with a 100-years moving average

Figure 5c also shows the TSR reconstruction is similar to the original reconstruction of Esper et al. (2012), however, the amplitude of the original reconstruction of Esper et al. (2012) is a little larger than the uncertainty range of the TSR reconstruction, a possible reason is that any predictor cannot exceed the predictand. In this case, the original reconstruction is considered to be the reconstructed target, because the instrumental data is not available.

As noted above, the “differing-contemporaneous-growth-rate bias” can produce spurious positive trends in the modern parts of chronologies (Briffa and Melvin (2011). Weak trends in the TSR reconstructions in Figs. 3 and 5 indicate that such end effect bias has been effectively mitigated by the TSR method. Moreover, the signal-free RCS results are located in the range of the ensemble TSR reconstructions in Fig. 5a, which means the signal-free RCS method can migrate the trend-in-signal bias. In theory, the TSR ensemble reconstructions include two ideal situations, whereby some samples have common RC that can yield a locally unbiased RC using the RCS method to account for the trend-in-signal bias, and then other samples with different tree growth rates can be standardized using the CFS method to account for the differing-contemporaneous-growth-rate bias. Thus, it is easy to understand that the performance of the TSR ensemble reconstructions broadly includes the signal-free RCS and multiple RCs reconstructions in Fig. 5.

The new TSR reconstructions and its uncertainty ranges are further compared with the original signal-free RCS reconstruction from Briffa et al. (2013) (Fig. 6), two RCs reconstruction from Melvin et al. (2013) (Fig. 7), and multiple RCs reconstruction from Esper et al. (2012) (Fig. 8) over different time-scales, ranging from inter-annual to millennial time-scales. The results show clear consistencies between the Yamal TSR and the signal-free RCS reconstructions, and its upper and lower limits at inter-annual to multi-decadal time-scales (Fig. 6a–c). At centennial time-scales, the difference between the TSR reconstruction and signal-free RCS reconstructions is more pronounced. The Yamal signal-free RCS reconstruction has a distinct upward trend after AD 1830, which is close to the upper limit of the uncertainty. The TSR reconstruction and lower limit of uncertainty both have similar upward trends after AD 1830, but both show a hiatus after ca. AD 1920. This would weaken the direct comparison of the TSR reconstruction with the instrumental temperatures, but the two reconstructions are still within the TSR uncertainty. The discrepancy between the signal-free RCS and TSR reconstructions increases even further at the millennial time-scale. Specifically, the signal-free RCS reconstruction captures a very cold Little Ice Age as compared with the TSR reconstruction, even if the two reconstructions both show a rather small variability before AD 1400. However, over even longer time-scales (Fig. 6f), the signal-free RCS reconstruction shows a dominant long-term positive trend after AD 445. Interestingly, this finding conflicts the results of Esper et al. (2012), where an orbital-scale negative trend was found during the past millennia in the MXD based summer temperature reconstruction used in this study. The TSR reconstruction generally has less variability than in the Briffa et al. (2013) reconstruction, and is characterized by first a decreasing and then an increasing trend over the past millennium. Consequently, the differences in the millennial time-scale and in long-term trends between the TSR reconstruction and the signal-free RCS reconstruction cannot be used to prove which one is closer to reality. However, the TSR reconstruction is able to provide an estimate of uncertainty from the standardization procedure.

Fig. 6
figure 6

Comparison of the TSR reconstruction (red line) with the signal-free RCS reconstruction (green line) from Briffa et al. (2013) in Yamal across different time-scales. The upper and lower limits of the TSR reconstruction are represented by the blue and black lines, respectively. a Inter-annual component and b inter-decadal component both start from AD 1800 in order to clearly show the consistency of changes, ce multi-decadal, centennial-, and millennial-scale components; f the long-term trend. All components were decomposed with the EEMD method

Fig. 7
figure 7

Same as Fig. 6 except for the reconstructions are for the Torneträsk record. The green line represents the two RCs standardization reconstruction from Melvin et al. (2013)

Fig. 8
figure 8

Same as Fig. 6 except for the reconstructions are for the Northern Scandinavia record. The green line represents the multiple RCs standardization reconstruction from Esper et al. (2012)

Figure 7 shows similar results to those in Fig. 6, whereby the variabilities in the new Torneträsk TSR reconstruction (and the associated uncertainty ranges) and the two RCs standardization reconstruction from Melvin et al. (2013) are consistent on inter-annual to multi-decadal time-scales (Fig. 7a–c). However, at the centennial-scale, there is a clear difference between the two time series, although this difference is still within the uncertainty range. On the millennial time-scale, there is a distinct difference in the amplitudes between the two Torneträsk reconstructions. The two RCs standardization reconstruction even exceeds the range of the uncertainty of the new TSR reconstruction during some intervals (e.g., AD 1500–1900), the two series are, however, almost identical in their long-term trends. Thus, this further verifies that the TSR reconstruction provides results close to the two RCs reconstruction method.

Figure 8 shows similar results to those shown in Figs. 6 and 7, in that there are no distinct differences in the first four components. However, the magnitudes of the millennial component and the secular trend of the TSR reconstruction are distinctly smaller than the multiple RCs reconstruction shown in Fig. 8e, f.

There were no significant differences in the components on inter-annual to centennial time-scales for the investigated standardization methods, indicating that the choice of standardizing process has barely effect on those time scales. The visible differences appeared in the low-frequency bands, particularly in the secular trends, which did not differ from the uncertainty estimations. An exception was the visible difference between the TSR and the original signal-free RCS method at the millennial-scale variability and the secular trend in Fig. 6e, f, which may indicate the advantages of the signal-free RCS method.

In the TSR reconstruction for the Yamal, the 100-years mean temperature (0.18 °C) during the period AD 1423–1522 is the warmest temperature over the past two millennia, and is warmer than the recent most 100-years mean temperature (0.11 °C) from AD 1905–2004. However, the warmest 100-years observed temperature is 0.22 °C from AD 1909 to 2008, where the observations go back to 1883, even the Yamal reconstruction severely underestimates the observed temperatures with 50%. Thus, we cannot yet determine whether the modern warm period was actually the warmest over the past millennia. For Torneträsk, the TSR reconstructions indicate that May–August mean temperatures during recent most 100-years period (AD 1911–2010, 0.02 °C) were not unprecedented during the past two millennia, but the warmest century was encountered in AD 359–458 with a mean temperature of 0.12 °C. In Torneträsk the TSR reconstruction is more similar to observations: the mean May–August temperature in AD 1907–2006) reached 0.04 °C (instrumental data going back to AD 1860–2006), which is consistent with the previously published conclusions of Melvin et al. (2013). In the Northern Scandinavia TSR reconstruction, the warmest 100-years mean June–August temperature (0.36 °C) occurred in AD 24–123, which is warmer than the 0.04 °C reconstructed for the recent most 100-years period (AD 1907–2006). Thus, the TSR reconstructions suggest that summer temperatures during the last century were not the warmest over the past two millennia.

Tables 2 and 3 show the warming rates in the Yamal and Torneträsk. The warming linear trend rate during the 20th century in Yamal is 0.40 °C/100 years from the instrumental data, which is larger than the three TSR reconstructions, but it is not the warmest trend; this occurred from AD 1812–1911 (2.64 °C/100 years). The Torneträsk results show similar features. The warming trend rate (0.85 °C/100 years) during the 20th century from the instrumental period is lower than that from AD 336–435 (3.31 °C/100 years). Similarly, the warming rate from AD 1453–1552 (1.37 °C/100 years) in Northern Scandinavia is larger than that (0.37 °C/100 years) during the 20th century from the original reconstruction (Esper et al. 2012). This indicates that the increasing warming rate of the recent 20th century is not unprecedented over the past two millennia in the three studied regions in the scenario, when the uncertainty of the influence of the tree growth trend is considered. However, these results may be affected by other issues (e.g., the number of tree-ring samples available through time).

Table 2 Comparison of the warming trend rates (unit: °C/100 year) defined by the magnitude of the 100-years linear trends in the Yamal, Torneträsk and Northern Scandinavia
Table 3 The contributions of EEMD components to 20th century warming for the three studied regions

The contributions of the different time-scale components on the 20th century warming in the three regions obtained using Eq. (8) are shown in Table 3. The multi-centennial variabilities significantly contribute to the linear 20th century warming trend with a range of 78.19–147.88% at Yamal, 29.91–72.82% at Torneträsk, and 9.19–31.38% in Northern Scandinavia (Table 3). In addition, the millennial-scale variabilities also contribute with a 41.26–84.14% at Yamal, 23.62–30.81% at Torneträsk, and 18.51–25.09% in Northern Scandinavia. The original reconstructions all show similar results. This implies that the multi-centennial- and millennial-scale temperature variability make essential contributions to the 20th century warming.

We further verified the robustness of the TSR reconstructions by comparing the different methods using spectral (Fig. 9) and wavelet (Fig. 10) analysis. The patterns of the spectra for the TSR and signal-free RCS reconstructions are similar (Fig. 10a, b; based on a continuous spectrum analysis). The first significant (p < 0.01) spectral peak in the TSR reconstruction was identified at 60.1 years in Yamal, which is exactly the same as in the signal-free RCS reconstruction. The first dominant cycle (60.1 years) is also the most significant cycle over most of the significant period in the Morlet wavelet power spectra for both two reconstructions in Fig. 10a, b.

Fig. 9
figure 9

Spectra (black lines) of the TSR reconstruction, the signal-free RCS reconstruction in Yamal from Briffa et al. (2013), two RCs standardization reconstruction in Torneträsk from Melvin et al. (2013), and multiple RCs standardization reconstruction in Northern Scandinavia (NS) from Esper et al. (2012). The red line is the 95% confidence level and the blue line is the red noise

Fig. 10
figure 10

Wavelet power spectra of the TSR reconstruction, the signal-free RCS reconstruction in Yamal from Briffa et al. (2013), two RCs standardization reconstruction in Torneträsk from Melvin et al. (2013), and multiple RCs standardization reconstruction in Northern Scandinavia from Esper et al. (2012). The regions of statistical significance at the 95% level are bordered by the black curves. Shaded regions outline the cone against the edge effects

The similarities in the spectral analysis between the TSR reconstruction and the two RCs reconstruction in Torneträsk (Fig. 9c, d) are not surprising. The most significant spectral peak in the two RCs reconstruction is 275.7 years, which is the third dominant peak in the TSR reconstruction. This cycle is clearly showed as a maximum in the wavelet power spectra throughout most of the period shown in Fig. 10c, d.

The most significant spectral peak (112.8 years) in the multiple RCs reconstruction is also the first dominant peak in the TSR reconstruction, which is distinctly illustrated from the wavelet power spectra in the Fig. 10e, f. This further verifies that the TSR reconstruction is consistent with the multiple RCs reconstruction. The lower limit of the TSR reconstruction better describes periodicity on centennial to millennial time-scales as compared with the signal-free RCS and multiple RCs reconstructions. The upper limit of the TSR reconstruction has stronger high-frequency periodicity than the other two reconstructions (figure not shown).

However, the original reconstructions in three sites are all derived from the variants of the RCS method as an extreme situation to retain maximumly low-frequency signal. In addition, the MXD records have more low-frequency temperature signal than the TRW records (Esper et al. 2012). Moreover, the original reconstruction for the MXD records is considered as a TSR target. In theory, the standard deviation of the predictor must be smaller than the standard deviation of the predictand. Here the small standard deviation is mainly reflected in low-frequency band. Thus, the magnitudes of the millennial component and the long-term trend of the TSR reconstruction in Fig. 8e are smaller than those of the multiple RCs reconstruction in Fig. 8f, and the power spectrum of the TSR reconstruction over millennial-scale in Fig. 9e is smaller than the multiple RCs reconstruction in Fig. 9f.

In summary, the phases or periodicities obtained using the TSR method are very similar to those obtained by the signal-free RCS and two RCs standardization methods. This indicates that the standardization method has no substantive influence on the periodicities of tree-ring chronologies, so the advantage of the TSR method is that it can provide an ensemble to account for some extreme situations of tree-growth deviation from the regional growth trend.

4 Conclusions

Removing the tree growth trend from tree-ring measurement, especially TRW chronology, has always been a key scientific issue, since it may have an significant impact on tree-ring based paleoclimate reconstructions. Here, we propose a TSR method to quantify the uncertainty of the influence of the tree growth trend on the tree-ring chronologies.

In the TSR method, part of the measurements are standardized using an ideal regional growth curve and another part using their long-term trends. This generates different members according to the ranked similarity between a regional growth curve and the curve fitting trend of each measurement. Our results indicate that tree growth trends mainly affect the secular trends in TRW chronologies, but have no effects on the variability over inter-annual to multi-decadal time-scales.

The TSR method was successfully used to reconstruct summer temperature variability in Yamal, Torneträsk, and Northern Scandinavia. Our results suggest that the recent linear warming trend rate during the recent most century is not unprecedented over the past two millennia, based on the TRW and MXD records. More importantly, the multi-centennial and millennial-scale variabilities of the three TSR chronologies significantly enhance the warming trend during the 20th century in the three studied regions.

The two main biases in the classical RCS method are discussed. The end effect due to the “differing-contemporaneous-growth-rate” bias is effectively mitigated (Fig. 3), and the “trend-in-signal” bias is overcome by the TSR ensemble reconstructions through using all possible locally unbiased RCs. The uncertainty quantification using the TSR method improves error estimates when undertaking tree-ring climate reconstruction in other areas, e.g., multi-proxy climate reconstructions, reconciliation of tree-ring and pollen data, or tree-ring data assimilation in climate models.

However, there is no “true” climate variability information on millennium or longer timescales to help distinguish between tree-ring growth and climate trends for the specific tree-ring measurements. Moreover, the TSR method has the same limitations as the RCS method (e.g., the essential need for pith offset information that is unavailable for the vast majority of all existing tree-ring data sets).