What is the link between temperature, carbon dioxide and methane? A multivariate Granger causality analysis based on ice core data from Dome C in Antarctica

Larsson, Rolf; Persson, Erik K.

doi:10.1007/s00704-023-04444-x

What is the link between temperature, carbon dioxide and methane? A multivariate Granger causality analysis based on ice core data from Dome C in Antarctica

Original Paper
Open access
Published: 09 May 2023

Volume 153, pages 49–55, (2023)
Cite this article

Download PDF

You have full access to this open access article

Theoretical and Applied Climatology Aims and scope Submit manuscript

What is the link between temperature, carbon dioxide and methane? A multivariate Granger causality analysis based on ice core data from Dome C in Antarctica

Download PDF

Rolf Larsson¹ &
Erik K. Persson²

2136 Accesses
13 Altmetric
1 Mention
Explore all metrics

Abstract

This paper continues the work of Kang and Larsson Theoretical and Applied Climatology 116:537–548 (2014) What is the link between temperature and carbon dioxide levels? A Granger causality analysis based on ice core data. Theoretical and Applied Climatology 116:537-548, by adding methane to the study of causality between temperature and carbon dioxide. The data used goes 800,000 years back in time which is possible due to it being extracted from ice cores located at Dome C in Antarctica. First we use linear interpolation to make the three data sets equidistant to be able to employ statistical methods. We adjust for a deterministic trend and find the best model fit to be an autoregressive model with lag two and a piecewise quadratic trend. By employing multivariate Granger causality tests we find strong evidence that temperature, carbon dioxide and methane all Granger cause each other in both directions i.e. carbon dioxide Granger causes temperature and temperature Granger causes carbon dioxide, etc. This is in accordance with the findings of Kang and Larsson, with the extension that we can add methane to establish a trivariate feedback system between temperature, carbon dioxide and methane.

On the causal structure between CO₂ and global temperature

Article Open access 22 February 2016

Causal feedbacks in climate change

Article 30 March 2015

A New method for identifying possible causal relationships between CO₂, total solar irradiance and global temperature change

Article 20 November 2015

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Global warming has for quite some time been one of humanity’s biggest challenges and, to a very high degree, still is. The aim of this article is to study the long term relationship between temperature, carbon dioxide and methane from ice core data from Antarctica, to see whether there exists a statistically significant causality between the three variables in question using the multivariate Granger causality test. This is a continuation of the paper from Kang and Larsson (2014) where they found a long term relationship between temperature and carbon dioxide concentration. In the present paper, we bring in methane concentration into the analysis.

References to previous related work are given in Kang and Larsson (2014). Here, we complement this with a brief account of more recent research on related subjects.

Two papers that study the impact of carbon dioxide and methane on temperature, and vice versa, using ice core data, are Davidson et al. (2016); Stips et al. (2016). The statistical model of Davidson et al. (2016) resembles our model. They also use a vector autoregressive (VAR) model which takes the deterministic trends (Milankovitch cycles) into account. But in contrast to Kang and Larsson (2014) and the present paper, these trends are modeled via known orbital variables. Another difference is that Ice volume is also brought in as an additional, fourth variable. Granger causality is not tested via parameter restriction. Instead, the analysis focuses on prediction intervals for changes of the dependent variable (e.g. temperature) when one of the covariates (e.g. carbon dioxide concentration) increases its value. In this way, a causal effect of temperature on carbon dioxide and methane is found, but not the other way around.

The same issue is under study in Stips et al. (2016). The methodological tools here are differential equations and Shannon entropy. Data used are both recent (from 1850 and on) and from ice cores. For more recent data, it is found that greenhouse gases, in particular carbon dioxide, are the main causal drivers of the temperature rise. But for ice core data, the cause-effect direction is reversed. The explanation is that the feedback of greenhouse gases to temperature changes seems to be slower that the fast response of temperature to changes of greenhouse gas concentrations. This finding is corroborated by Muryshev et al. (2019).

Other papers of special interest include the following: Faes et al. (2017) modifies the Granger causality testing method in VAR models and conclude that, when analyzing ice core data, the proof for carbon dioxide rise to cause temperature rise is stronger than for the opposite causation. The papers by Seip et al. (2018); Vakulenko et al. (2016) point out, using different methods, that causalities between carbon dioxide concentration and temperature may go in different directions over different time periods. Finally, Estrada and Perron (2019) analyze data from the last centuries and find that breaks in carbon dioxide trend to precede breaks in temperature trend, which they interpret as a sign of causality.

The structure of the present paper is as follows: in Section 2 a short description of the data is given followed by a briefing of the data processing necessarily to be able to employ traditional statistical methods. Thereafter, in Section 3, the statistical analysis used is described. Section 4 initiates with a briefing of the results regarding model selection, optimizing break points and some residual analysis, culminating in the results of the multivariate Granger causality tests. Finally in Section 5 a discussion of the results and comments on future work takes place.

The statistical analysis was performed using the statistical computer software R version 3.2.3.

2 Background and data

The source of data, National Centers for Environmental Information (NCEI) monitored by National Oceanic and Atmospheric Administration (NOAA) is a world wide climate and historical weather database. More specifically the data used in this study was collected at Dome C (by European Project for Ice Coring in Antarctica (EPICA)) which is located at east Antarctica. The temperature, carbon dioxide and methane series all go from, approximately, 800,000 years back in time to around 100 years before present. This is possible by drilling a 3,270 m ice core from which the climate and gas data can be retrieved. Using the attribute of ice cores to store small air bubbles makes it possible to access information about gas compositions in the past. By analyzing different isotopes it is also achievable to reconstruct the temperature level, cf Alley (2010).

2.1 Data processing

The three data sets (temperature, CO$_{2}$ and CH$_{4}$) all span over roughly the same time period, however, there are large differences in number of observations and they are not observed at the same time. Henceforth, to be able to properly analyse the data, a simple linear interpolation was used to make the observations of the three series equidistant. This was done by using the following formula

$$\begin{aligned} x = x_{1} + \frac{t-t_{1}}{t_{2}-t_{1}}(x_{2}-x_{1}). \end{aligned}$$

(1)

Here $x_{1}$ is observed at time $t_{1}$ and $x_{2}$ at time $t_{2}$ hence (1) will calculate x at time $t_{1}< t < t_{2}$. The interval was chosen in agreement with Kang and Larsson (2014) and was set to be 1,000 years. The number of interpolation points is a trade-off: choosing too few might result in a loss of information whereas too many will lead to artificiality resulting in a possible bias.

2.2 Structural break points

By inspecting Figs. 1, 2 and 3 several obvious structural break points are observed and the pattern seems to be similar among the three series. As suggested by Kang and Larsson (2014) the eight initial break points shown in Table 1 will be used for the three series. The break points will later on be modified to optimize fit and the result of that modification is also shown in Table 1.

3 Statistical analysis

In accordance with Kang and Larsson (2014) a number of trends were found by inspecting Fig. 1. It is quite obvious that there is a resemblance between the shapes of the three series. An exponential trend seems appropriate to consider, hence it might be reasonable to combine autoregressive terms, in order to account for the serial correlation, with piecewise quadratic functions to explain the deterministic exponential behavior. This reasoning leads to the combined model

$$\begin{aligned} \begin{pmatrix} X_{1t} \\ X_{2t} \\ X_{3t} \end{pmatrix}= & {} \begin{pmatrix} \nu _{1t} \\ \nu _{2t} \\ \nu _{3t} \end{pmatrix} + \sum \limits _{k=1}^{p}{\begin{pmatrix} \delta _{k11} &{} \delta _{k12} &{} \delta _{k13} \\ \delta _{k21} &{} \delta _{k22} &{} \delta _{k23} \\ \delta _{k31} &{} \delta _{k32} &{} \delta _{k33} \end{pmatrix}}\begin{pmatrix} X_{1,t-k} \\ X_{2,t-k} \\ X_{3,t-k} \end{pmatrix} \nonumber \\ {}{} & {} + \begin{pmatrix} \epsilon _{1t} \\ \epsilon _{2t} \\ \epsilon _{3t} \end{pmatrix}, \end{aligned}$$

(2)

where $(\epsilon _{1t},\epsilon _{2t},\epsilon _{3t})'$ are iid with expectation 0 and the deterministic terms are

$$\begin{aligned} \nu _{it}= & {} \sum \limits _{j=1}^{7}{\Big \{d_{ij0} + d_{ij1}(t-T_{j}) + d_{ij2}(t-T_{j})^{2}\Big \}}\nonumber \\ {}{} & {} \times I\Big \{T_{j} \le t < T_{j+1}\Big \}, \end{aligned}$$

(3)

where i = 1,2,3 (temperature, CO$_{2}$, CH$_{4}$) and I $\big \{T_{j} \le t < T_{j+1}\big \}$ is 1 if $T_{j} \le t < T_{j+1}$ and 0 otherwise. In (3) $d_{ij0}$ is the constant term, $d_{ij1}$ the linear and $d_{ij2}$ the quadratic. p in (2) is the number of lags.

3.1 Akaike information criterion

The Akaike information criterion (AIC), cf Akaike (1974), measures how well a model fits the data. The lower value of AIC the better the fit. This was taken into consideration when finding the optimal model. The formula is

$$\begin{aligned} AIC = 2k - 2log\hat{L}, \end{aligned}$$

where k is the number of parameters in the model and $\hat{L}$ is the maximum likelihood.

Table 1 Initial and modified break points for temperature, CO$_{2}$ and CH$_{4}$

Full size table

Table 2 AIC of the univariate models. Pqt is short for piecewise quadratic trend and mbp is short for modified break points

Full size table

Table 3 Residual sum of squares of forecast errors in the out-of-sample analysis. Pqt is short for piecewise quadratic trend and mbp is short for modified break points

Full size table

Table 4 Estimated parameters of the trivariate temperature series with CO$_{2}$ and CH$_{4}$ as the second and third variable. One, two or three stars indicate significance level at 5, 1 and 0.1%, respectively

Full size table

3.2 Out of sample analysis

In addition to AIC an out-of-sample analysis was employed to help decide the final model. This was done by estimating the models using all of the data with the exception of the last 50 observations. Thereafter the ruled out observations were predicted from the estimated models and, lastly, the residual sums of squares (RSS) of deviations from the observations were calculated.

3.3 Granger causality test

The process $X_{2}$ Granger causes $X_{1}$ if $\delta _{k12}$ $\ne$ 0 for some k in (2), and similarly, $X_{1}$ Granger causes $X_{2}$ if $\delta _{k21}$ $\ne$ 0 for some k as first described by Granger (1969). This was later developed into a formal F test by Sims (1972). In this paper, we prefer to use an alternative test proposed by Lütkepohl (1982). He used the likelihood ratio test

$$\begin{aligned} \Lambda = T(ln RSS_{r} - ln RSS_{u}), \end{aligned}$$

(4)

where T is the number of observations, RSS$_{r}$ and RSS$_{u}$ are the residual sums of squares of the restricted and unrestricted model, respectively. Then Lütkepohl compared that to the $\chi ^2$ distribution. The number of degrees of freedom is the number of restricted parameters.

For every trivariate model Lütkepohl carried out three tests. E.g. in our case, for the trivariate temperature model the first test would have the null hypothesis that the lag coefficients for CO$_{2}$ and CH$_{4}$ would be zero ($\delta _{k12}$ = 0 and $\delta _{k13}$ = 0, for k = 1,2), versus the alternative hypothesis that these two lag coefficients are nonzero. This test determines if at least one of $\delta _{k12}$ or $\delta _{k13}$ is nonzero and therefore has an impact on temperature. The number of degrees of freedom is 2p.

The second test has the null hypothesis that only $\delta _{k12}$ is zero versus the same alternative. The third test has the null hypothesis that only $\delta _{k13}$ is zero versus the same alternative. The latter two tests help determine the significance of the two variables one by one. The number of degrees of freedom is p.

4 Results

To investigate whether the three reconstructed series were stationary or not the augmented Dickey-Fuller test (Dickey and Fuller 1979) was used. The null hypothesis of nonstationarity was tested against the alternative of stationarity. The test gave rejection in all the three cases (test statistics $-4.8$, $-4.6$ and $-6.1$ for the temperature, carbon dioxide and methane series, respectively). The Akaike Information Criterion (AIC) was calculated for six different models for each of the three series. That is (2) with p = 1 with and without the piecewise quadratic trend (3) and (2) with p = 2 with and without the piecewise quadratic trend (3). The result of this can be seen in Table 2 where the lowest AIC and therefore the best fit was given by AR(2) with piecewise quadratic trend for temperature and CO$_{2}$. However, the best fit for CH$_{4}$ was given by AR(1) with piecewise quadratic trend, though by a very small margin. Furthermore by inspecting Table 3, where residual sum of squares (RSS) of forecast errors in the out-of-sample analysis were calculated for the six different models, similar results were given with the exception that AR(2) with piecewise quadratic trend was the optimal model for temperature and CH$_{4}$ while AR(1) with piecewise quadratic trend was slightly better for CO$_{2}$. With these results in mind the model with piecewise quadratic trend, modified break points and lag length two was chosen for all three series as the final model, considering that it was the model that overall produced the best fit when weighing in AIC and the out-of-sample analysis.

The parameter estimates for the final trivariate models with modified break points for temperature, CO$_{2}$ and CH$_{4}$ can be seen in Tables 4, 5 and 6, respectively. The modified break points, together with the initial break points, can be seen in Table 1. They were produced for every series separately to optimize the fit. The AIC and RSS of the models with modified break points are also shown in Tables 2 and 3, respectively.

Table 5 Estimated parameters of the trivariate CO$_{2}$ series with temperature and CH$_{4}$ as the second and third variable. One, two or three stars indicate significance level at 5, 1 and 0.1%, respectively

Full size table

Table 6 Estimated parameters of the trivariate CH$_{4}$ series with temperature and CO$_{2}$ as the second and third variable. One, two or three stars indicate significance level at 5, 1 and 0.1%, respectively

Full size table

The modified break points were calculated according to a procedure presented in Kang and Larsson (2014) where, initially, the last^{Footnote 1} three break points ($T_{5}$, $T_{6}$ and $T_{7}$) were fixed. Thereafter an iteration is initiated to find the set of $T_{2}$, $T_{3}$ and $T_{4}$ that minimizes the RSS within ±10 of the previous set. Then it proceeds by fixing the newly updated set of $T_{2}$, $T_{3}$ and $T_{4}$ and employing the same iteration to $T_{5}$, $T_{6}$ and $T_{7}$ until optimized.

The autocorrelation function and partial autocorrelation function of the residuals for the three reconstructed series were calculated. No obvious patterns were detected. These figures, as well as estimated parameters for all univariate and bivariate models, may be obtained from the authors upon request.

4.1 Multivariate Granger causality tests

The results from the multivariate Granger causality tests are shown in Table 7, and it is clear that we can reject all the null hypotheses at any realistic significance level. This means that there is strong empirical evidence that temperature Granger causes CO$_{2}$, CO$_{2}$ Granger causes CH$_{4}$, CH$_{4}$ Granger causes temperature, CO$_{2}$ Granger causes temperature, CH$_{4}$ Granger causes CO$_{2}$ and temperature Granger causes CH$_{4}$.

Table 7 Likelihood ratio tests for Granger causality

Full size table

5 Discussion and conclusions

First we processed the data by linear interpolation to make classical statistical methods easier to employ. Thereafter we found the best model fit to be an autoregressive model with lag two and a piecewise quadratic trend. The residual analysis showed no obvious patterns.

The main results in this paper strengthen the findings of Kang and Larsson (2014) that temperature Granger causes CO$_{2}$ and CO$_{2}$ Granger causes temperature. We can also further expand that result by presenting empirical evidence that there exists a whole feedback system between temperature, CO$_{2}$ and CH$_{4}$ where everyone one of them affects the other. This means that the continuous emission of the greenhouse gases CO$_{2}$ and CH$_{4}$ will, indeed, lead to an increase of the surface temperature.

For further studies on the subject one could consider including more variables, like nitrous oxide (N$_{2}$O) concentration into the analysis, cf Fisher et al (2019). N$_{2}$O is, just like CO$_{2}$ and CH$_{4}$, a greenhouse gas so it would be interesting to see in what way it would interact with our established trivariate feedback system and N$_{2}$O concentration is retrievable from ice cores hence it would not be difficult to carry through.

Ideally all the data would have been collected at the same time and the three series would have been equidistant from the start to avoid the necessity of data processing i.e artificiality but, alas, this was not the case.

Furthermore one could always consider different, more advanced, statistical models e.g. nonlinear or continuous time models as suggested by Kang and Larsson (2014).

Notes

The very last and first break point ($T_{8}$ and $T_{1}$) are left out of this optimization since they are already fixed as the endpoint and the starting point, respectively.

References

Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 6:716–723
Article Google Scholar
Alley RB (2010) Reliability of ice-core science: historical insights. Journal of Glaciology 56:1095–1103
Article Google Scholar
Davidson JEH, Stephenson DB, Turasie AA (2016) Time series modeling of paleoclimate data. Environmetrics 27:55–65
Article Google Scholar
Dickey D, Fuller WA (1979) Distribution of the estimators for autoregressive time series with a unit root. J Am Stat Assoc 74:427–431
Google Scholar
Estrada F, Perron P (2019) Causality from long-lived radiative forcings to the climate trend. Annals of the New York Academy of Sciences 1436:195–205
Article Google Scholar
Faes L, Nollo G, Stramaglia S, Marinazzo D (2017) Multiscale Granger causality. Physical Review E 96:042150
Article Google Scholar
Fisher H, Schmitt J, Bock M, Seth B, Joos F, Spahmi R, Lienert S, Battaglia G, Stocker BD, Schilt A, Brook EJ (2019) ${\rm N}_2$O changes from the last glacial maximum to the preindustrial- Part 1: Qualitative reconstruction of terrestial and marine emissions using ${\rm N}_2$O stable isotopes in ice cores. Biogeosciences 16:3997–4021
Article Google Scholar
Granger CWJ (1969) Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37:424–438
Article Google Scholar
Kang J, Larsson R (2014) What is the link between temperature and carbon dioxide levels? A Granger causality analysis based on ice core data. Theoretical and Applied Climatology 116:537–548
Article Google Scholar
Lütkepohl H (1982) Non-causality due to omitted variables. Journal of econometrics 19:367–378
Article Google Scholar
Muryshev KE, Eliseev AV, Denisov SN, Mokhov II, Arzhanov MM, Timazhev AV (2019) Time lag between changes in global temperature and atmospheric C${\rm O}_2$ content under anthropogenic emissions of C${\rm O}_2$ and C${\rm H}_4$ into the atmosphere. IOP Conference Series: Earth and Environmental Science 231:012039
Google Scholar
Seip KL, Gron O, Wang H (2018) Carbon dioxide precedes temperature change during short-term pauses in multi-millenial palaeoclimate records. Palaeogeography, Palaeoclimatology, Palaeoecology 506:101–111
Article Google Scholar
Sims CA (1972) Money, income and causality. Am Econ Rev 62:540–552
Google Scholar
Stips A, Macias D, Coughlan C, Garcia-Gorriz E, San Liang X (2016) On the causal structure between C${\rm O}_2$ and global temperature. Nature, Scientific Reports 6:21691
Article Google Scholar
Vakulenko NV, Kotlyakov VM, Sonechkin DM (2016) Lead-lag relationships between atmospheric trends of temperature and carbon dioxide concentrations during the Pilocene. Geography 467:423–426
Google Scholar

Download references

Funding

Open access funding provided by Uppsala University.

Author information

Authors and Affiliations

Uppsala University, SE-751 06 Uppsala, Uppsala, P.O. Box 480, Sweden
Rolf Larsson
Uppsala, P.O. Box 6363, Sweden
Erik K. Persson

Authors

Rolf Larsson
View author publications
You can also search for this author in PubMed Google Scholar
Erik K. Persson
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Rolf Larsson.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Larsson, R., Persson, E.K. What is the link between temperature, carbon dioxide and methane? A multivariate Granger causality analysis based on ice core data from Dome C in Antarctica. Theor Appl Climatol 153, 49–55 (2023). https://doi.org/10.1007/s00704-023-04444-x

Download citation

Received: 18 November 2019
Accepted: 23 March 2023
Published: 09 May 2023
Issue Date: July 2023
DOI: https://doi.org/10.1007/s00704-023-04444-x

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

What is the link between temperature, carbon dioxide and methane? A multivariate Granger causality analysis based on ice core data from Dome C in Antarctica

Abstract

Similar content being viewed by others