Abstract
A range of existing statistical approaches for reconstructing historical temperature variations from proxy data are compared using both climate model data and real-world paleoclimate proxy data. We also propose a new method for reconstruction that is based on a state-space time series model and Kalman filter algorithm. The state-space modelling approach and the recently developed RegEM method generally perform better than their competitors when reconstructing interannual variations in Northern Hemispheric mean surface air temperature. On the other hand, a variety of methods are seen to perform well when reconstructing surface air temperature variability on decadal time scales. An advantage of the new method is that it can incorporate additional, non-temperature, information into the reconstruction, such as the estimated response to external forcing, thereby permitting a simultaneous reconstruction and detection analysis as well as future projection. An application of these extensions is also demonstrated in the paper.
References
Allen MR, Stott PA (2003) Estimating signal amplitudes in optimal fingerprinting. Part I: theory. Clim Dyn 21:477–491
Ammann CM, Joos F, Schimel D, Otto-Bliesner BL, Tomas R (2007) Solar influence on climate during the past millennium: results from transient simulations with the NCAR Climate System Model. Proc Nat Acad Sci 104:3713–3718
Briffa KR, Osborn TJ, Schweingruber FH, Harris IC, Jones PD, Shiyatov SG, Vaganov EA (2001) Low-frequency temperature variations from a northern tree-ring density network. J Geophys Res 106:2929–2941
Brohan P, Kennedy JJ, Harris I, Tett SFB, Jones PD (2006) Uncertainty estimates in regional and global observed temperature changes: a new dataset from 1850. J Geophys Res 111:D12106
Brown PJ (1993) Measurement, regression, and calibration. Oxford University Press, Oxford, 201 pp
Bürger G, Cubasch U (2005) Are multiproxy climate reconstructions robust? Geophys Res Lett 32:L23711
Bürger G, Fast I, Cubasch U (2006) Climate reconstruction by regression—32 variations on a theme. Tellus 58A:227–235
Caines PE (1988) Linear stochastic systems. Wiley, New York, 847 pp
Coelho CAS, Pezzulli S, Balmaseda M, Doblas-Reyes JJ, Stephenson D (2004) Forecast calibration and combination: a simple Bayesian approach for ENSO. J Clim 17:1504–1516
Durbin J, Koopman SJ (2001) Time series analysis by state space methods. Oxford University Press, Oxford, 253 pp
Esper J, Cook ER, Schweingruber FH (2002) Low-frequency signals in long tree-ring chronologies for reconstructing past temperature variability. Science 295:2250–2253
Esper J, Frank DC, Wilson RJ, Briffa KR (2005) Effect of scaling and regression on reconstructed temperature amplitude for the past millennium. Geophys Res Lett 32:L07711
Fierro RD, Golub GH, Hansen PC, O’Leary DP (1997) Regularization by truncated total least squares. SIAM J Sci Comput 18:1223–1241
Flato G, Boer GJ (2001) Warming asymmetry in climate change experiments. Geophys Res Lett 28:195–198
Fuller WA (1987) Measurement error models. Wiley, New York, 440 pp
Harvey A (1989) Forecasting, structural time series models and the Kalman filter. Cambridge University Press, Cambridge, 554 pp
Hegerl GC, Crowley TJ, Baum SK, Kim K-Y, Hyde WT (2003) Detection of volcanic, solar, and greenhouse gas signals in paleo-reconstructions of Northern Hemispheric temperature. Geophys Res Lett 30:1242 doi:10.1029/2002GL016635
Hegerl GC, Crowley TJ, Allen MR, Hyde WT, Pollack HN, Smerdon JE, Zorita E (2007) Detection of human influence on a new, validated 1500 year temperature reconstruction. J Clim 20:650–666
Jensen JL, Petersen NV (1999) Asymptotic normality of the maximum likelihood estimator in state space models. Ann Stat 27:514–535
Jones PD, Osborn TJ, Briffa KR (1997) Estimating sampling errors in large-scale temperature averages. J Clim 10:2548–2568
Jones PD, Briffa KR, Barnett TP, Tett SFB (1998) High-resolution palaeoclimatic records for the last millennium: Integration, interpretation and comparison with general circulation model control run temperatures. Holocene 8:455–471
Juckes MN, Allen MR, Briffa KR, Esper J, Hegerl GC, Moberg A, Osborn TJ, Weber SL, Zorita E (2006) Millennial temperature reconstruction intercomparison and evaluation. Clim Past Discuss 2:1001–1049
Kalman R (1960) A new approach to linear filtering and prediction problems. J Basic Eng 82:34–45
Lee TCK (2008) On statistical approaches to climate change analysis. University of Victoria, PhD dissertation, to be available on https://dspace.library.uvic.ca:8443/dspace/
Mann ME, Bradley RS, Hughes MK (1998) Global-scale temperature patterns and climate forcing over the past six centuries. Nature 392:779–787
Mann ME, Rutherford S, Wahl E, Ammann CM (2005) Testing the fidelity of methods used in proxy-based reconstructions of past climate. J Clim 18:4097–4107
Mann ME, Rutherford S, Wahl E, Ammann CM (2007) Robustness of proxy-based climate field reconstruction methods. J Geophys Res (in press)
Moberg A, Sonechkin DM, Holmgren K, Datsenko NM, Karlen W (2005) Highly variable Northern Hemisphere temperatures reconstructed from low- and high-resolution proxy data. Nature 433:613–617
Osborn TJ, Raper SCB, Briffa KR (2006) Simulated climate change during the last 1,000 years: comparing the ECHO-G general circulation model with the MAGICC simple climate model. Clim Dyn 27:185–197
Rutherford S, Mann ME, Osborn TJ, Bradley RS, Briffa KR, Hughes MK, Jones PD (2005) Proxy-based Northern Hemisphere surface temperature reconstructions: Sensitivity to methodology, predictor network, target season, and target domain. J Clim 18:2308–2329
Schneider T (2001) Analysis of incomplete climate data: Estimation of mean values and covariance matrices and imputation of missing values. J Clim 14:853–871
Shumway RH, Stoffer DS (1982) An approach to time series smoothing and forecasting using the EM algorithm. J Time Ser Anal 3:253–264
Shumway RH, Stoffer DS (2000) Time series analysis and its applications. Springer, Heidelberg, 549 pp
von Storch H, Zorita E, Jones JM, Dimitriev Y, Gonzalez-Rouco F, Tett SFB (2004) Reconstructing past climate from noisy data. Science 306:679–682
von Storch H, Zorita E, Jones JM, Gonzalez-Rouco F, Tett SFB (2006) Response to comment on “Reconstructing past climate from noisy data”. Science 312:529c
Wahl ER, Ritson DM, Ammann CM (2006) Comment on “Reconstructing past climate from noisy data”. Science 312:529b
Zorita E, von Storch H (2005) Methodical aspects of reconstructing non-local historical temperatures. Memorie Soc Astron Ital 76:794–801
Zorita E, Gonzalez-Rouco JF, Legutke S (2003) Testing the Mann et al. (1998) approach to paleoclimate reconstructions in the context of a 1000-yr control simulation with the ECHO-G coupled climate model. J Clim 16:1378–1390
Acknowledgments
We thank Caspar Ammann, Gabriele Hegerl and Eduardo Zorita for providing their data for use in this study. We also thank Gabriele Hegerl for helpful and constructive discussion. We gratefully acknowledge that Terry Lee was supported by the Canadian Foundation for Climate and Atmospheric Science through the Canadian CLIVAR Research Network. Work by Min Tsao was supported by the Natural Sciences and Engineering Research Council through a Discovery Grant. This paper was improved by insightful and helpful comments provided by Scott Rutherford, Walter Skinner, Xuebin Zhang and an anonymous referee.
Appendix
1.1 Expectation maximization algorithm
Here we present the steps required to estimate the unknown parameters that specify the state-space model (Eq. 6). We use \({\Theta}= \{\zeta,R,\phi,\Upsilon, Q, {\mu,}\Sigma\}\) to represent the vector of unknown parameters. The derivations presented here are expanded from Shumway and Stoffer (1982; 2000, pp. 321–325). Detailed derivations of the following procedure are given in Lee (2008). We first write down the likelihood function for \(\{P_t;\ t = 1, 2, \ldots, n\}\) as in Shumway and Stoffer (2000). Ignoring a constant, the likelihood function \(L_P(\Theta)\) can be expressed as:
$$ -\ln L_P(\Theta) = \frac{1}{2}\sum_{t=1}^{n}\ln \Omega_t + \frac{1}{2}\sum_{t=1}^{n}\frac{\varepsilon_t^2}{\Omega_t} $$
where \(\Omega_t = \zeta^2 S_t^{t-1} + R > 0\) and \(\varepsilon_t = P_t - \zeta T_t^{t-1}\) for t = 1, 2, ..., n. The calibration period hemispheric mean temperatures \(\{T_t;\ t = n + 1, \ldots, n + m\}\) are not involved in the above likelihood function. To better utilize the available information, we have modified the likelihood function to include \(\{T_t;\ t = n + 1, \ldots, n + m\}\). Ignoring a constant, one can express the likelihood function for the observation data set \(\{P_t, T_s;\ t = 1, 2, \ldots, n;\ s = n + 1, \ldots, n + m\}\), namely \(L_{P,T}(\Theta)\), as
where \(\Omega_t\) and \(\varepsilon_t\) are defined as before for t = 1, 2, ..., n. For t = n + 1, ..., n + m, \(\Omega_t = R\) and \(\varepsilon_t = P_t - \zeta T_t\). The goal here is to find the Θ values that maximize the likelihood function \(L_{P,T}(\Theta)\). Using the EM algorithm, such Θ values can be found through an iterative procedure. The estimation procedure is summarized as follows.
1. Select the starting values for the parameters \(\Theta^{(0)}= \{\zeta^{(0)},R^{(0)},\phi^{(0)}, \Upsilon^{(0)},Q^{(0)}, \mu^{(0)}\}\) while fixing Σ, one of the initial conditions for the Kalman recursion. (In our analysis, we set Σ = 0.05. Results were almost identical when different Σ values were used.) On iteration j (j = 1, 2, ...), do steps 2–4.
2. Let N = n + m. Using \(\Theta^{(j-1)}\), compute the Kalman filter and smoother estimates using Eqs. (7) and (8) for t = 1, 2, ..., N and the likelihood function \(\ln L_{P,T}(\Theta^{(j-1)})\). Also calculate, for t = N − 1, N − 2, ..., 0,
$$ S_t^{N} = S_t^t + J_t^2(S_{t+1}^{N}-S_{t+1}^t) $$
and, for t = n, n − 1, ..., 0,
$$ \begin{aligned} C_t &= J_{t-1}S_{t}^{N}\\ {\widetilde{{\mathbf{T}}}}_t^{N}&={{\mathbf{T}}}_t^{N}+(J_tJ_{t+1}\ldots J_n)({{\mathbf{T}}}_{n+1}-{{\mathbf{T}}}_{n+1}^N)\\ {\widetilde{S}}_t^{N}&=S_t^{N}- (J_t J_{t+1}\ldots J_n)^2S_{n+1}^N \end{aligned} $$
and the following quantities:
$$ \begin{aligned} Z_{11}&=\sum_{t= 1}^{n}[({\widetilde{{\mathbf{T}}}}_t^{N})^2+{\widetilde{S}}_t^{N}]+\sum_{t= n+1}^{N}({{\mathbf{T}}}_t)^2\\ Z_{00}&=\sum_{t= 0}^{n}[({\widetilde{{\mathbf{T}}}}_t^{N})^2+{\widetilde{S}}_t^{N}]+\sum_{t= n+1}^{N- 1}({{\mathbf{T}}}_t)^2\\ Z_{10}&=\sum_{t= 1}^{n}({\widetilde{{\mathbf{T}}}}_t^{N} {\widetilde{{\mathbf{T}}}}_{t-1}^{N}+C_t) + {{\mathbf{T}}}_{n+1} {\widetilde{{\mathbf{T}}}}_{n}^{N}+\sum_{t=n+2}^{N}{{\mathbf{T}}}_t {{\mathbf{T}}}_{t-1}\\ F_{11}&=\sum_{t=1}^{n} {{\mathbf{F}}}_t{\widetilde{{\mathbf{T}}}}_t^{N}+\sum_{t= n+1}^{N}{{\mathbf{F}}}_t {{\mathbf{T}}}_t\\ F_{10}&=\sum_{t=1}^{n+1} {{\mathbf{F}}}_{t}{\widetilde{{\mathbf{T}}}}_{t-1}^{N}+\sum_{t=n+2}^{N}{{\mathbf{F}}}_{t} {{\mathbf{T}}}_{t-1}\\ F_{00}&=\sum_{t= 1}^{N}{{\mathbf{F}}}_t{{\mathbf{F}}}_t^{\rm T}. \end{aligned} $$
3. Obtain \(\Theta^{(j)}\) using the following:
$$ \begin{aligned} \zeta^{(j)}&= \left[\sum_{t= 1}^n[({\widetilde{{\mathbf{T}}}}_t^N)^2+{\widetilde{S}}_t^N]+\sum_{t= n+1}^N{{\mathbf{T}}}_t^2\right]^{-1} \left[\sum_{t=1}^n{\widetilde{{\mathbf{T}}}}_t^N{{\mathbf{P}}}_t+\sum_{t= n+1}^N {{\mathbf{T}}}_t{{\mathbf{P}}}_t\right]\\ R^{(j)}&=N^{-1}\sum_{t= 1}^n\left[{\widetilde{S}}_t^N(\zeta^{(j)})^2+({{\mathbf{P}}}_t-\zeta^{(j)} {\widetilde{{\mathbf{T}}}}_t^N)^2\right] +N^{-1}\sum_{t=n+1}^N\left({{\mathbf{P}}}_t- \zeta^{(j)}{{\mathbf{T}}}_t\right)^2\\ \Upsilon^{(j)}&=\left(F_{11}^{\rm T}-F_{10}^{\rm T}Z_{10}/Z_{00}\right) \left(F_{00}-F_{10}F_{10}^{\rm T}/Z_{00}\right)^{-1}\\ \phi^{(j)}&=\left(Z_{10}-\Upsilon^{(j)}F_{10}\right)Z_{00}^{-1}\\ Q^{(j)}&=N^{-1}\left(Z_{11}-\phi^{(j)}Z_{10}- \Upsilon^{(j)}F_{11}\right)\\ \mu_0^{(j)}&={\widetilde{{\mathbf{T}}}}_0^N. \end{aligned} $$
4. Repeat steps 2 and 3 until convergence. For the analysis presented in this paper, the algorithm is stopped when \(\ln L_{P,T}(\Theta^{(j)}) - \ln L_{P,T}(\Theta^{(j-1)}) < 0.0005\).
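The filtering and smoothing pass in step 2 can be illustrated with a minimal Python sketch. It is not the authors' code. Eqs. (6)–(8) are not reproduced in this appendix, so the sketch assumes the standard scalar state-space form \(T_t = \phi T_{t-1} + \Upsilon F_t + w_t\), \(P_t = \zeta T_t + e_t\), with \(w_t \sim N(0, Q)\) and \(e_t \sim N(0, R)\), a single scalar forcing series in place of the vector \(F_t\), and the standard gain \(J_t = \phi S_t^t / S_{t+1}^t\). The function name `filter_and_smooth` and its argument names are hypothetical. It covers only the reconstruction period (no calibration-period temperatures) and returns the innovations-form value of \(-\ln L_P(\Theta)\), up to a constant.

```python
import math

def filter_and_smooth(P, F, zeta, R, phi, ups, Q, mu, Sigma):
    """Scalar Kalman filter and smoother under the ASSUMED model
         T_t = phi*T_{t-1} + ups*F_t + w_t,  w_t ~ N(0, Q)
         P_t = zeta*T_t + e_t,               e_t ~ N(0, R)
    P[t-1] and F[t-1] hold the values for t = 1..n.
    Returned lists are indexed by t = 0..n.
    """
    n = len(P)
    T_pred = [0.0] * (n + 1)        # T_t^{t-1}; index 0 unused
    S_pred = [0.0] * (n + 1)        # S_t^{t-1}; index 0 unused
    T_filt = [mu] + [0.0] * n       # T_t^t, started at T_0^0 = mu
    S_filt = [Sigma] + [0.0] * n    # S_t^t, started at S_0^0 = Sigma
    neg_loglik = 0.0

    # Forward pass: one-step prediction, innovation, update.
    for t in range(1, n + 1):
        T_pred[t] = phi * T_filt[t - 1] + ups * F[t - 1]
        S_pred[t] = phi ** 2 * S_filt[t - 1] + Q
        eps = P[t - 1] - zeta * T_pred[t]        # innovation eps_t
        omega = zeta ** 2 * S_pred[t] + R        # innovation variance Omega_t
        gain = zeta * S_pred[t] / omega          # Kalman gain
        T_filt[t] = T_pred[t] + gain * eps
        S_filt[t] = (1.0 - gain * zeta) * S_pred[t]
        neg_loglik += 0.5 * (math.log(omega) + eps ** 2 / omega)

    # Backward pass: S_t^N = S_t^t + J_t^2 (S_{t+1}^N - S_{t+1}^t).
    T_sm = list(T_filt)
    S_sm = list(S_filt)
    for t in range(n - 1, -1, -1):
        J = phi * S_filt[t] / S_pred[t + 1]
        T_sm[t] = T_filt[t] + J * (T_sm[t + 1] - T_pred[t + 1])
        S_sm[t] = S_filt[t] + J ** 2 * (S_sm[t + 1] - S_pred[t + 1])

    return T_sm, S_sm, neg_loglik
```

In an EM loop of the kind described above, a pass like this would be rerun with each updated parameter set, and the change in the log-likelihood between successive iterations compared against the stopping threshold.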
To provide a final estimate of \(T_t\), the Kalman filter and smoother estimates are recalculated using Eqs. (7) and (8) with the estimated parameters \({\hat{\Theta}},\) for t = 1, 2, ..., n + m. At the time of convergence, one can also calculate the standard errors for \({\hat{\Theta}}.\) For parameters estimated using \(\ln L_P(\Theta)\), the asymptotic variance of \({\hat{\Theta}}\) is defined as (Caines 1988, Chap. 7; Jensen and Petersen 1999):
where \(\partial^2/\partial\Theta\) denotes the second derivatives with respect to Θ. In our application, we have used \(L_{P,T}(\Theta)\) in the estimation procedure and hence a reasonable estimate of the asymptotic variance can be obtained by replacing \(L_P(\Theta)\) with \(L_{P,T}(\Theta)\) in the above formula. Further discussion of this can be found in Lee (2008). An analytical form of the asymptotic variance is generally hard to find and hence we calculated it numerically. The asymptotic variance can be used to provide confidence bounds for the estimated parameters. In particular, the 95% confidence bound for Θ is defined as \({\hat{\Theta}} \pm 1.96 \sqrt{\hbox{Var}({\hat{\Theta}})}.\) In our application, we are interested in the confidence bounds for the parameters \(\delta_{GS}\), \(\delta_{VOL}\) and \(\delta_{SOL}\). These bounds can provide a detection assessment of the importance of the response to the GS, VOL and SOL forcing on the hemispheric mean temperature.
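The numerical calculation of the asymptotic variance can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the procedure of Lee (2008): it takes the variance estimate to be the inverse of the Hessian of the negative log-likelihood at \({\hat{\Theta}}\), approximates that Hessian by central finite differences, and forms \({\hat{\Theta}} \pm 1.96\sqrt{\hbox{Var}({\hat{\Theta}})}\) from the diagonal. The function names and the step size `h` are hypothetical choices.

```python
import math

def numerical_hessian(f, theta, h=1e-4):
    """Central finite-difference Hessian of f at theta (list of floats)."""
    k = len(theta)
    H = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            def shifted(di, dj):
                x = list(theta)
                x[i] += di * h
                x[j] += dj * h
                return f(x)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return H

def invert(A):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    k = len(A)
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
         for i, row in enumerate(A)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(k):
            if r != c:
                fac = M[r][c]
                M[r] = [vr - fac * vc for vr, vc in zip(M[r], M[c])]
    return [row[k:] for row in M]

def confidence_bounds(neg_loglik, theta_hat, z=1.96):
    """Bounds theta_hat +/- z*sqrt(Var), with Var ASSUMED to be the
    inverse Hessian of the negative log-likelihood at theta_hat."""
    cov = invert(numerical_hessian(neg_loglik, theta_hat))
    bounds = []
    for i, th in enumerate(theta_hat):
        se = math.sqrt(cov[i][i])
        bounds.append((th - z * se, th + z * se))
    return bounds

# Illustrative quadratic negative log-likelihood (made-up numbers):
# two parameters with standard errors 0.2 and 0.5.
def toy_neg_loglik(theta):
    return 0.5 * ((theta[0] - 1.0) ** 2 / 0.04 + (theta[1] + 2.0) ** 2 / 0.25)

bounds = confidence_bounds(toy_neg_loglik, [1.0, -2.0])
```

Under this reading, an interval for a forcing-response parameter that excludes zero would commonly be taken as evidence that the corresponding response is detectable.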
Cite this article
Lee, T.C.K., Zwiers, F.W. & Tsao, M. Evaluation of proxy-based millennial reconstruction methods. Clim Dyn 31, 263–281 (2008). https://doi.org/10.1007/s00382-007-0351-9
DOI: https://doi.org/10.1007/s00382-007-0351-9