The increasingly strong evidence of climate change has determined a growing concern on the impact of economic activities on global climate change over the last two decades. This has resulted in a rich body of literature examining the nature of the linkages between per capita income and greenhouse gas emissions usually proxied by per capita \(\hbox {CO}_{2}\) (see Holtz-Eakin and Selden 1995; Cole et al. 1997; Galeotti et al. 2006; Barassi and Spagnolo 2012 among others)Footnote 1. The literature has mostly found that per capita \(\hbox {CO}_{2}\) emissions have a direct and positive relationship with per capita incomeFootnote 2. However, these studies do not say much about the spatial distribution of \(\hbox {CO}_{2}\) emissions and the way that this may evolve over time. Although it may not be of significant importance from the point of view of the impact on the environment, the spatial distribution of \(\hbox {CO}_{2}\) emissions has significant implications for climate change modelling and especially for the success of negotiations in multilateral climate agreements (Aldy 2006). For these reasons, more recently, building on the literature on income convergence, a seemingly rich body of literature on convergence of \(\hbox {CO}_{2}\) emissions has developed (see Pettersson et al. 2014 for a review).

Convergence of per capita \(\hbox {CO}_{2}\) emissions has been called for by a number of (non-governmental) lobby groups as well as members of the academic and scientific community based on arguments of fairness and equity, specifically on the principle of allocating to each individual the same ‘right to pollute’ (Stegman 2005; Böhringer and Welsch 2006; Mackey and Rogers 2015). In this sense, countries with low initial per capita emissions are “allowed” to experience growing per capita emissions thereby catching up with more highly polluting countries which are expected to reduce their per capita emissions. In practice, the principle that every individual has an equal right to use the atmosphere as a reservoir for greenhouse gas emissions, could be used for example, to set a long-term emission budget and then sharing this fund among countries such that per capita emissions are levelled up in the long run (absolute convergence) or are aimed at convergence within a targeted time horizon to different steady states accounting for country heterogeneity (conditional convergence)Footnote 3.

This paper focuses on investigating the stochastic convergence of \(\hbox {CO}_{2}\) emissions. In particular we aim at contributing to the literature by investigating the occurrence of (possibly very slow) stochastic convergence, first, among a set of 28 OECD economies and then on an enlarged sample which will also include 4 out of 5 of the BRICS (Brazil, Russia, India, China and South Africa) over the last 60 years. To this end, we first use several variants of the local Whittle (LW hereafter) fractional integration tests on the relative per capita \(\hbox {CO}_{2}\) emissions to assess their order d of (fractional) integration, I(d) (such that \(1 \le d \le 0\)) thereby allowing for the detection of convergence which occurs at a slower pace than it would be detectable by confining the analysis within the \(I(0)-I(1)\) dichotomy. Then, we adopt a rigorous statistical testing framework to shed light on possible slow convergence with changes in the trend function of the individual series as opposed to the lack of convergence. Specifically, we adopt two different but complementary tests to distinguish between fractional integration (slow convergence) and structural changes to an otherwise I(0) series. We shall apply these techniques to samples of countries comprising subsets of OECD countries as well as the BRICS and emerging economies.

Within the rich literature on stochastic convergence, Strazicich and List (2003) found significant evidence of convergence of per capita \(\hbox {CO}_{2}\) emissions. In contrast, Aldy (2006), using a similar technique for 23 OECD countries over the period 1960–1999, found mixed results. However, in the same paper, when using a larger sample of 88 countries, Aldy (2006) found no evidence of convergence. Barassi et al. (2008) employed a greater variety of tests than earlier studies, testing both the null of a unit root and the null of stationarity for both individual relative emissions series and for the OECD panel as a whole allowing for cross-sectional dependencies within the panel. Little evidence was found suggesting that per capita \(\hbox {CO}_{2}\) emissions, within the OECD, are converging. Westerlund and Basher (2008) extended the time series to a sample going from 1870 to 2002 and also allowed for cross-sectional dependencies. Using data on \(\hbox {CO}_{2}\) emissions for 28 developed and developing countries they did find evidence of stochastic convergence. Furthermore, Romero-Ávila (2008) used panel stationarity tests allowing for structural breaks and found evidence of convergence while Panopoulou and Pantelidis (2009) suggested that in recent years there might have been two separate convergence patterns with countries with similar fundamentals converging to different equilibria. It is rather clear from the above that there exist very mixed evidence on the convergence of \(\hbox {CO}_{2}\) particularly within the OECD.

In the last decade, with energy and climate change debate, an extensive literature on fractional integration has emerged, which goes well beyond the \(I(0)-I(1)\) dichotomy and considers the possibility that the variables of interest may follow a long memory I(d) process with \(0<d<1\). This long-range dependence is characterized by a hyperbolically-decaying autocovariance function, and by a spectral density that approaches infinity as the frequency tends to zero. The intensity of this phenomenon is measured by a differencing parameter (Palma 2007). While this more flexible approach has been widely adopted in the macroeconomics literature, it is relatively new in the literature on energy (Elder and Serletis 2008; Lean and Smyth 2009; Gil-Alana et al. 2010; Apergis and Tsoumas 2011, 2012; Barros et al. 2012a, b among others). The results from these fractional integration tests generally suggested that the energy variables (energy demand and supply) considered are stationary but all exhibit long memory and thus, shocks have transitory albeit long-lasting effects. These results are particularly informative as \(\hbox {CO}_{2}\) emissions have a direct relationship with energy consumption. As a result, a growing literature on the of \(\hbox {CO}_{2}\) emissions has also developed (Barassi et al. 2011 and Gil-Alana et al. 2015).

The evidence from this literature is that in some cases the existence of unit roots cannot be rejected, but, in most other cases, although emissions are still covariance non-stationary, they may be reverting toward the cross sectional mean. Overall, the evidence seems to go in the direction of identifying very long memory in \(\hbox {CO}_{2}\) emissions to the extent that even transitory policy shocks can potentially lead to very long lasting, if not permanent effects.

The development and application of tests for fractional integration finds its rationale in the potential bias associated to standard unit root tests in cases where occasional structural breaks in the deterministic terms are neglected (Granger and Hyung 2004). Also, unit root tests are known to have poor power near the alternative of stationarity, and moreover, they cannot provide detailed information about the level of persistence of the series. Fractional integration might possibly be the reason why standard unit roots or stationarity tests have often failed to offer consistent findings with regard to the order of integration of \(\hbox {CO}_{2}\) emissions, their mean reversion and convergence.

Fractional integration of \(\hbox {CO}_{2}\) emissions implies high persistence (and even covariance non-stationarity) with mean/trend reversion in the long run. Recent papers (Barassi et al. 2011 and Gil-Alana et al. 2015) motivated the use of fractional integration modelling, among other reasons, by the presence of occasional breaks in the \(\hbox {CO}_{2}\) series which, otherwise, would only be weakly autocorrelated. This is particularly relevant in the case of emissions as country specific (or international) changes in environmental regulations, or historical events, although infrequent, may have led to shifts in the deterministics of a country’s per capita \(\hbox {CO}_{2}\) series. Granger and Hyung (2004) claimed that fractional integration and/or infrequent breaks may be hard to distinguish from one another and so, fractional integration modelling strategies have to be preferred in order to produce more accurate forecasts. Several ways to disentangle the presence of long memory from structural changes have now been developed (see Diebold and Inoue 2001; Berkes et al. 2006; Shimotsu 2006; Ohanissian et al. 2008; Aue and Horváth 2013; Mayoral 2012).

In this paper we initially estimate the order of (fractional) integration of the \(\hbox {CO}_2\) emissions series for 28 OECD countries over the period 1950–2013, by means of the estimators in Robinson (1995) and Shimotsu and Phillips (2006). Subsequently, using the techniques of Berkes et al. (2006) and Mayoral (2012), we test whether the estimated order of integration \(\hat{d}\) arises from the correct inference about the persistence of the relative \(\hbox {CO}_{2}\) emissions series, or rather it is the result of a structural change in the deterministic components of the series. The analysis is repeated for an enlarged sample comprising the 28 OECD economies already used and 4 out of the 5 BRICS.

Our analysis contributes to the literature in two ways: first, we test the constructed relative per capita \(\hbox {CO}_{2}\) for fractional integration as we consider the possibility that relative per capita emissions may actually be mean/trend reverting long memory processes. Second, we use two recently developed tests for long memory versus structural breaks, whose presence, if overlooked, might well undermine our understanding about the existence of convergence of relative emissions and also its speed.

The layout of the paper is as follows. Section 2 outlines the econometric modelling approach. Section 3 describes the data and presents the empirical findings. Section 4 summarises the main findings, and a summary offers some concluding remarks.


Testing for Convergence

Following Strazicich and List (2003), we begin our analysis by defining the natural logarithm of country i relative per capita emissions \((y_{it})\) as:

$$\begin{aligned} y_{it}= ln\left( \frac{{ PCE}_{it}}{N^{-1}\sum _{i=1}^{N}{} { PCE}_{it}}\right) , \end{aligned}$$

where \({ PCE}_{it}\) is the per capita emissions in country i and year t and N is the total number of countries. To test for the null of non-convergence we test whether the relative emissions in country i, \(y_{it}\) contain a unit root (or are indeed trend stationary or mean/trend reverting).

It is well known that standard unit root tests are grossly oversized when applied to series which have structural changes in their deterministics, and also that they suffer from low power when a series has roots near unity. This is to say that unit root tests are often not able to distinguish between highly persistent and infinitely persistent series. Fractional integration tests offer a more general framework within which it is possible to test whether \(\hbox {CO}_{2}\) emissions are indeed infinitely persistent I(1) or are instead series which display high persistence (and possibly covariance non-stationarity) but are mean/trend reverting in the long run such that their order of integration is I(d) with \(0<d<1\). A simple fractionally integrated representation of the individual relative emissions series including a trend function can be written as:

$$\begin{aligned} \Delta ^{d}y_{it}= c_{i0} + \gamma _{i}t + \epsilon _{it} \end{aligned}$$

where \({\epsilon _{it}}\) is a zero mean white noise process, d is the order of integration of the series, and \(c_{i0} + \gamma _{i}t \) is a deterministic trend function.

Here, we estimate the order of (fractional) integration \(\hat{d}\) of the term on the left hand side of Eq. (2) using the Local Whittle (LW) Estimator and its variants and subsequently a test for the null hypothesis \(H_{0}: \hat{d}=1\) versus the one sided alternative that \(H_{1}: \hat{d}<1\) is conducted to test for the null of a unit root versus the alternative of mean reversion and long memory with either bounded or explosive variance (depending on the value of \(\hat{d}\)). Provided that mean reversion of relative emissions holds this can then be taken as evidence of convergence of \(\hbox {CO}_{2}\) emissions.

We initially estimate d by means of the semi-parametric Gaussian LW estimator (see Robinson 1995), which is developed under the slightly restrictive assumption that \(y_{t}\) is covariance stationary. Specifically, for \(d_{0} \in (-\,1/2,1/2)\) and under other appropriate assumptions and conditions, Robinson (1995) and Shimotsu (2006) show that \(\sqrt{m} (\hat{d}-d_{0}) \rightarrow N(0,1/4)\), where \(m=n^{\alpha }\) with n being the sample size and \(\alpha \) being a truncation parameter. However, Phillips and Shimotsu (2004) show, when \(d>1/2\) the LW estimator exhibits non-standard behaviour, and although it is consistent for \(d \in (1/2,1]\) and asymptotically normal for \(d \in (1/2,3/4)\), it has non-normal asymptotic distribution for \(d \in [3/4,1]\) and \(d>1\), but also converges to 1 in probability and is inconsistent.

To circumvent these potential issues, we use the Exact Local Whittle (ELW) estimator and its variants (Shimotsu and Phillips 2005; Shimotsu 2006), whose asymptotics are based on the exact frequency domain or its estimated version, the Feasible Exact Local Whittle estimator (FELW). The ELW and FELW are computationally more demanding than the LW but they have been shown to be consistent and asymptotically normally distributed for any value of d and therefore are valid under a wider range of cases (see Shimotsu 2006). In applying the class of LW estimators, it is crucial to choose the truncation parameter \(\alpha \), where \(m=n^a\) as the magnitude of m governs the speed of convergence of the LW-type of estimators to their asymptotic distribution.

Shimotsu and Phillips (2005) and Shimotsu (2006) discuss the importance of the choice of m, explaining that in the case of LW and ELW estimation, m has to grow fast for \(\hat{d}\) to be consistent, but also that a too large value of m may induce a bias to the estimator from the short run dynamics. Given the relatively small size of our sample, this rule of thumb suggests that we should choose a value of \(\alpha \) in the interval [0.65, 0.80]. Having obtained the estimates for \(\hat{d}\) for the individual \(\hbox {CO}_{2}\) relative emissions through the LW, ELW and FELW estimators, we aim to test the following hypotheses for all the series:

$$\begin{aligned} \left\{ \begin{matrix} H_0: \quad \hat{d}_{i}=1 \\ H_1: \quad \hat{d}_{i}<1 \end{matrix}\right. \end{aligned}$$

The Whittle-based test statistic for the null of a unit root versus the alternative of long memory is:

$$\begin{aligned} { LWT}=\sqrt{m}(\hat{d}-1) \overset{a}{\sim } N(0,\sigma ^{2}). \end{aligned}$$

Obviously, given the nature of the test and the fact that we shall apply it to a relatively small sample of \(n=64\), it is necessary to compute the empirical distributions of the different local Whittle based tests. Table 1 displays the exact distribution of LW, ELW, FELW and FELW with de-trending estimators for \(n=64\) and truncation parameters ranging between 0.65 and 0.8.

Table 1 Exact distribution of the local Whittle estimators

Having estimated the order of integration and assessed the occurrence (or lack) of convergence of the individual relative \(\hbox {CO}_{2}\) emissions, the second step of our analysis involves testing whether these series indeed have long or infinite memory or instead, their persistence is the result of a structural change in their deterministic terms and therefore they are spurious long (infinite) memory processes. There exist a substantial amount of literature on the testing for unit roots against the alternative of structural changes in the deterministics of an otherwise stationary process (see for example Perron 1989; Banerjee et al. 1992; Harvey et al. 2015). However there are only few studies proposing testing procedures to distinguish between long memory and spurious long memory induced by changes in the deterministic components of time series processes (see Berkes et al. 2006; Aue et al. 2009; Aue and Horváth 2013; Mayoral 2012).

In what follows we introduce two of these testing procedures which we shall apply to our data in the attempt to understand whether, once relative emissions have been identified as I(d) long or I(1) infinite memory processes, the implied slow convergence or the lack of it are genuine or are instead the result of a mean change in an otherwise stationary process, which would indicate a faster convergence (with breaks).

Tests for Long Memory Versus Spurious Long Memory

Once the fractional integration order \(\hat{d_i}\) indicates that a time series such as \(y_{it}\) above is a long (but perhaps not infinite) memory process, it is not unambiguous whether it is indeed a long range dependent process or whether it is a short memory stationary process with a mean change. Even though both of two cases indicate convergence of \(y_{it}\), a stationary process with mean change suggests that per capita \(\hbox {CO}_2\) emissions indeed converge at a faster speed than a long memory process.

Although the occurrence of long memory processes has been frequently found, mostly in high frequency financial data, it is actually rare in low frequency economic series. Indeed, low frequency economic data, especially over a relatively short time horizon, are more likely to be non-stationary. Once again, these non-stationary processes, i.e. \(d_i\in (0.5, 1.5)\), can be real non-stationary, or a stationary process with a change in mean. In this subsection, we introduce a test proposed by Mayoral (2012) to investigate this issue. Our choice finds its justification in that the size and power of this test are found to be surprisingly reliable in small samples according to our simulation results.

The test assumes that the each of processes of interest is generated as follows:

$$\begin{aligned} y_t=\beta ' \cdot Z_t+\delta \cdot V_t(w)+x_t. \end{aligned}$$

where \(Z_t\) is a deterministic component which can either be a constant or a trend functionFootnote 4, \(V_t(w)\) equals to \(Z_{t-k}\) after break point kFootnote 5, and is equal to zero before the break point, and the stochastic term \(x_t\) is defined as,

$$\begin{aligned} (1-L)^{d}x_t= \epsilon _t, \quad \epsilon _t \sim N(0,\sigma ^{2}), \end{aligned}$$

Based on this data generating structure, the hypotheses tested are,

$$\begin{aligned} \left\{ \begin{matrix} H_0: \quad d\in D_0, \quad \delta =0, \quad D_0\subset (0.5,1.5) \\ H_1: \quad d=0, \quad \delta \quad { unconstrained} \end{matrix}\right. \end{aligned}$$

To test these hypotheses, Mayoral (2012) proposes a semi-parametric test, and constructs a most powerful invariant (MPI) statistic through a comparison of the log-likelihoods under \(H_0\) and \(H_1\). The consistent statistics \(R_b^f(\hat{d}_T)\) has the following form:

$$\begin{aligned} R_b^f(\hat{d}_T)=T^{1-2\hat{d}_T}\left( \frac{\hat{\lambda }^2}{\hat{\gamma }_0}\right) ^{-1}\frac{inf_{\omega \in \Omega } (y-Z\widetilde{\beta }-V(\omega )\widetilde{\delta })'(y-Z\widetilde{\beta }-V(\omega )\widetilde{\delta })}{(\Delta ^{\hat{d}_T}y-\Delta ^{\hat{d}_T}Z\hat{\beta })'(\Delta ^{\hat{d}_T}y-\Delta ^{\hat{d}_T}Z\hat{\beta })}, \end{aligned}$$

In the above statistic, the order of (possibly) fractional integration \(\hat{d}_T\) is consistently estimated by means of the FLEW. As a semi-parametric test, the statistics require us to estimate parameters in Eq. (5) through ordinary least square under both \(H_0\) and \(H_1\). All parameters in the numerator, including \(\widetilde{\beta }\) and \(\widetilde{\delta }\) are estimated under \(H_1\), whereas \(\hat{\beta }\) and \(\hat{\delta }\) are estimated under \(H_0\). The \(\hat{\gamma }_0\) is the estimated variance of \(u_t\) under the null hypothesis. The parameter \(\lambda ^2\) is obtained from:

$$\begin{aligned} \hat{\lambda }^2=\hat{\gamma _0}+2\sum _{i=1}^qk\left( \frac{j}{q}\right) \hat{\gamma }_i. \end{aligned}$$

where \(\gamma _i\) is the ith autocovariance of \(u_t\), \(k(\cdot )\) is the Bartlett kernel function, and q is the optimal bandwidth selected by Newey and West (1994) optimization. The consistency of \(R_b^f(\hat{d}_T)\) is provided in Mayoral (2012).

As for our initial estimation of the order of integration of the individual series and the relative test for \(\hat{d}=1\) we provide the exact critical values for the \(R_b^f(\hat{d}_T)\) statistic under the \(H_0\) by means of a simulation using a sample size of 64 observations and 10, 000 replications (Table 2).

Table 2 Empirical critical values of \(R_b^f(\hat{d}_T)\) statistic

We also provide evidence on the size and power properties of the test when applied to a small sample like ours. Specifically, we design a small experiment such that under the null hypothesis, the series are integrated of order \(d=0.7\), and are stationary I(\(d=0\)) with a break of magnitude 1 under the alternative. The simulation results, displayed in Table 3, show that the test is correctly sized for truncation parameters \(\alpha \) ranging between 0.65 and 0.80 and is remarkably powerful even in small samples.

Table 3 Size and power of Mayoral’s test

Although the test of Mayoral (2012) is very powerful in distinguishing genuine long memory and unit root processes from spurious ones, unfortunately it is not a change point detection test. This means that it can only identify whether one process is either long memory/non-stationary or a mean changing stationary process but it cannot detect the break location for the ‘spurious’ cases. Hence, we need to complement the above test to uncover the location of any change-points in the series object of our study. To this end, we employ the test developed by Berkes et al. (2006) which is designed to test the following null hypothesis \(H_0\):

$$\begin{aligned} \left\{ \begin{matrix} y_t=\mu +\epsilon _t, \quad 1<t<k^* \\ y_t=\mu ^*+\epsilon _t, \quad k^*+1<t<n \end{matrix}\right. \end{aligned}$$

where \(\mu ^{*}=\mu +\Delta \) and \(\epsilon _t\) is fourth-order stationary process with mean 0, satisfying the following condition:

$$\begin{aligned} n^{-\frac{1}{2}}\sum _{1\le j \le nt}\epsilon _j \overset{d}{\rightarrow } \sigma W(t), \quad \text {in} \quad { D[0,1]}, \end{aligned}$$

where W(t) is a standard Brownian motion. The null of a break in the deterministic term is tested against two different alternative hypotheses. The first alternative \(H^{(1)}_{1}\) is that the order of fractional integration for process \(y_t\) lies between 0 and 0.5, such that in this case \(\epsilon _t\) satisfies the following condition:

$$\begin{aligned} H_1^{(1)}: \quad \frac{1}{n^H}\sum _{1\le j \le nt}\epsilon _j \overset{d}{\rightarrow } c_HW_H(t), \quad \text {in} \quad { D[0,1]}, \end{aligned}$$

where \(c_H>0\) and \(\frac{1}{2}<H<1\). The second alternative \(H_{1}^{(2)}\) implies that the process \(y_t\) contains a unit root, and the innovation process \(\epsilon _t\) satisfies:

$$\begin{aligned} H_1^{(2)}: \quad \frac{1}{\sqrt{n}}\epsilon _{[n,t]} \overset{d}{\rightarrow } \sigma W(t), \quad ({ n \rightarrow \infty }), \end{aligned}$$

We test \(H_{0}\) by means of a non-parametric CUSUM statistics proposed by Berkes et al. (2006) and Aue et al. (2009).

Note that the CUSUM tests of Berkes et al. (2006) and Aue et al. (2009), by assuming a break under the null, are actually reversing the null of Mayoral (2012) thus it will be informative to compare the two sets of results. The main idea in Berkes et al. (2006) and Aue et al. (2009) is to introduce a change-point which occurred at time point \(\hat{k}\), and estimate \(\hat{k}\) by,

$$\begin{aligned} \hat{k}=min\{k:max_{1\le i \le n}|\sum _{1\le j\le i} Y_j-\frac{i}{n}\sum _{1\le j \le n}Y_j|=|\sum _{1 \le j \le k}Y_j-\frac{k}{n}\sum _{1 \le j \le n}Y_j|\}, \end{aligned}$$

As the testing procedure is designed to account for a single change-point, we take the min\(\{ \cdot \}\) of the algorithm allowing us to obtain the location of the strongest change-point. Therefore, the entire sample is split into two sub-samples according to the break point \(\hat{k}\). In each sub-sample, the \(T_n\) statistics are constructed as:

$$\begin{aligned} T_{n,1}= & {} \frac{1}{s_{n,1}}\hat{k}^{-\frac{1}{2}}max_{1\le k \le \hat{k}}|\sum _{1 \le i \le k}Y_i-\frac{k}{\hat{k}}\sum _{1\le i \le \hat{k}}Y_i|, \end{aligned}$$
$$\begin{aligned} T_{n,2}= & {} \frac{1}{s_{n,2}}(n-\hat{k})^{-\frac{1}{2}}max_{\hat{k}< k \le n}|\sum _{\hat{k}< i \le k}Y_i-\frac{k-\hat{k}}{n-\hat{k}}\sum _{\hat{k}< i \le n}Y_i|, \end{aligned}$$

In the above equations, the CUSUM statistics are standardized by the long-run standard deviations \(s_{n,1}\) and \(s_{n,2}\). There are several different ways to estimate the long run variance \(s_n^2\), for example, if \(y_t\) is a dependent series, the \(s_n^2\) can be estimated as:

$$\begin{aligned} s_n^2=\hat{\gamma _0}+2\sum _{1 \le j \le q(n)} \omega _j(q(n))\hat{\gamma }_j. \end{aligned}$$

where \(\hat{\gamma }_j\) is obtained by \(\frac{1}{n}\sum _{1 \le t \le n-j}(Y_t-\bar{Y}_n)(Y_{t+j}-\bar{Y_n})\), \(\omega _j(\cdot )\) is a kernel weighted function, and q(n) is the bandwidth on kernel \(\omega _j(\cdot )\). Andrews (1991) discussed five different types of kernel functions which can be used to calculate heteroskedasticity-autocorrelation consistent (HAC) estimators. We choose the ‘Bartlett’ kernel, following Berkes et al. (2006) and Aue et al. (2009). However, we choose the ‘Bartlett’ kernel weights \(\omega _j(q)=1-\frac{j}{q+1}\), with seven types of bandwidths \(q(\cdot )=Bw\), including \(Bw1=10* \log _{10}(n)\), \(Bw2=15* \log _{10}(n)\), \(Bw3=n^{\frac{1}{4}}\), \(Bw4=\sqrt{\ln (n)}\), \(Bw5=n^{\frac{2}{5}}\), \(Bw6=1.2*n^{\frac{1}{3}}\) and Bw7 that is the Newey-West optimal bandwidth. The first two bandwidths are those used in Berkes et al. (2006), and Bw3–Bw6 in Trapani et al. (2016).

Once the \(T_n\) statistics from the two sub-samples are obtained, we can construct the \(M_n\) statistic as follows:

$$\begin{aligned} M_n=max\{T_{n,1},T_{n,2}\}. \end{aligned}$$

Berkes et al. (2006) derive the following asymptotic distribution of \(M_n\) statistics under the null hypothesis:

$$\begin{aligned} M_n \overset{d}{\rightarrow } max\{sup_{0 \le t \le 1}|B^{(1)}(t)|, sup_{0 \le t \le 1}|B^{(2)}(t)|\}. \end{aligned}$$

They also provide the asymptotic critical values at the 10, 5 and 1% significance levels, which are 1.36, 1.48 and 1.72, respectively (also see Aue et al. 2009). However, these critical values might not be appropriate for our study because the data set used here is not large enough as to approximate the asymptotic distribution. Hence, we compute the exact critical values under the \(H_0\) for a sample of size \(n=64\).

We generate a stationary process \(y_t\) with zero mean and unit variance. In addition, we set the magnitude of the change \(\Delta =0\) so that no change occurs in \(y_t\), and \(\epsilon _t\) satisfies the condition in Eq. (11). For sample sizes \(n=1000\) and \(n=64\) the exact critical values are computed by simulating \(y_t 10{,}000\) times. Table 4 reports the obtained critical values and Table 5 shows the size and power of this test.

Table 4 Empirical critical values for \(M_n\) statistics

To assess the Type I error, we simulate a process which follows Eq. (10) under \(H_0\), setting the break in the mean at the middle of the sample and its magnitude equal to 1. Additionally, the power of the test is measured through generating data under both \(H_1^{(1)}\) and \(H_1^{(2)}\). Thus, there are two versions of the DGP, the first being \((1-L)^{0.4}\cdot y_t=\epsilon _t\) and the other is \((1-L)^{1}\cdot y_t=\epsilon _t\). The \(y_t\) process is a stationary long memory process in the first case and has a unit root in the latter case. The simulation is run with 10, 000 replications for a sample of \(n=1000\) and a small sample of \(n=64\).

Clearly, by using different bandwidths, the performances of Berkes’s test vary extensively. When it comes to a large sample with 1000 observations, the test is always correctly sized. As regards to its power, the test under \(H_1^{(2)}\) generally outperforms its counterpart under \(H_1^{(1)}\), although the power under \(H_1^{(1)}\) can reach \(91.29\%\) at the \(10\%\) significance level once the Bw3 bandwidth is adopted. A similar scenario occurs when the small sample is considered. The test performs better in distinguishing a unit root from a stationary process with change in mean as compared to stationary long memory ones under \(H_0\). Also, because of the small sample, most of bandwidth selections show poor power even though they provide tests which are well or slightly under sized. \(Bw4=\sqrt{ln_n}\) is an exception. Despite of \(n=64\) observations, the power of the test achieves 45.15 and \(89.07\%\) at the \(10\%\) significance level against \(H_1^{(1)}\) and \(H_1^{(2)}\) respectively and it is only slightly oversized. Therefore, we apply Bw4 bandwidth selection for Berkes’ test in our empirical analysis.

Table 5 The size and power of Berkes et al’s test

Data and Empirical Analysis

We collect annual country level per capita \(\hbox {CO}_{2}\) emissions data from the Carbon Dioxide Research CenterFootnote 6. The data ranges between 1950 and 2013, for a total of 64 observations for each country. In our analysis we consider two data sets. Firstly, we collect data on 28 members of the OECD including Austria, Belgium, Canada, Chile, Denmark, Finland, France, Greece, Hungary, Iceland, Ireland, Israel, Italy, Japan, Korea, Luxembourg, Mexico, Netherlands, Norway, New Zealand, Poland, Portugal, Spain, Sweden, Switzerland, Turkey, United Kingdom and United States (we exclude GermanyFootnote 7 and AustraliaFootnote 8). This data set is consistent with the data set used in previous papers (Strazicich and List 2003; Aldy 2006 and Barassi et al. 2011) although with different sample lengthsFootnote 9. For a robustness check and further investigation, we also consider a second enlarged group, obtained by adding 4 of the largest developing economies—BRICS (Brazil, China, India and South AfricaFootnote 10) giving us a total of 32 countries. Notice that we also considered a third group comprising just the 4 BRICS economies.

Note that the relative per capita \(\hbox {CO}_{2}\) emissions series contain, by construction, the cross sectional average calculated within each group. The highly heterogeneous composition of the enlarged OECD \(+\) BRICS group, with its increased difference in terms of the productive structure of the BRICS and their faster growth, is very likely to be characterised by a larger cross sectional variation and thus a very different cross sectional average. This leads us to expect that relative per capita \(\hbox {CO}_{2}\) emissions are more likely to converge within a small and technologically homogeneous group of countries as compared to larger and more heterogeneous ones.

For each group, we use LW-based estimators to estimate the order of fractional integration for the constructed relative per capita \(\hbox {CO}_2\) emissions series \(y_{it}\). The statistics (refer to Eq. 4) will be subsequently used to identify the convergence status of \(y_{it}\). Furthermore, we re-assess the results obtained by testing (Mayoral 2012; Berkes et al. 2006) actual long memory of the relative emissions series (implying strong or infinite persistence) versus spurious integratedness, which would indicate a faster convergence than a long memory process albeit with a change in its deterministics.

The 28 OECD Countries

Figure 1 displays the plot of the relative \(\hbox {CO}_{2}\) emissions for the group comprising the 28 OECD economies, whereas Fig. 2 reports the order of integration of the series estimated by means of LW, ELW, FELW and FELW tests, with de-trending, using a grid of truncation parameters \(\alpha \). Results suggest that the relative per capita \(\hbox {CO}_{2}\) emissions series are covariance non-stationary.

Fig. 1
figure 1

The 28 OECD countries: the relative per capita \({\hbox {CO}}_{{2}}\) emissions. This figure plots the relative per capita \(\hbox {CO}_2\) emissions for each country. It illustrates that these series move toward the cross-sectional mean, which equals zero after de-meaning. The y axis in each sub-plot is the value of per capita \(\hbox {CO}_2\) emissions, and x axis is the time line

Fig. 2
figure 2

The 28 OECD countries: the local Whittle related estimators. This figure plots the estimated order of fractional integration through LW, ELW, FELW and FELW with de-trending under the truncation parameter ranged from 0.65 to 0.8. The y axis in each sub-plot is the value of estimated fractional integration, and x axis is the truncation parameter m

This should not be mistaken by complete absence of convergence as there is strong evidence of mean reversion in the case of Austria, Denmark, Finland, Iceland, Ireland, Israel, Norway and Switzerland and to some extent also for Spain, Portugal, Sweden and Turkey. In the case of Austria, the results are pretty consistent regardless of the test used and the bandwidth \(\alpha \in [0.65, 0.8]\) Footnote 11. In detail, the estimated \(\hat{d}\) ranges between 0.49 and 0.72 (always statistically different from 1) indicating long memory, mean reversion (convergence) but not covariance stationarity. Similar results are reported in the cases of Denmark, when using the ELW estimator, \(\hat{d}\) ranges from 0.54 to 0.63, while for Finland we obtained estimates of \(\hat{d}\) between 0.72 and 0.85 when using either ELW or FELW with \(\alpha \in [0.75,0.8]\). In addition we report that the \(\hat{d}\) takes values between 0.5306 and 0.7503 for Iceland, 0.4617 and 0.8493 for Ireland and 0.5642 and 0.6595 for Israel. In most cases it is the plain ELW estimator that provides the lowest order of integration of the series in this group, this result being consistent across the different values of the bandwidth used. Furthermore, the ELW and FELW with detrending estimates for Norway deliver values for \(\hat{d}\) ranging from 0.63 to 0.86 while for Switzerland the order of integration lies between 0.6 and 0.85 regardless of the estimator used and with \(\alpha \in [0.75, 0.8]\). Weaker evidence of convergence is found in the cases of Spain, with \(\hat{d}\) ranging from 0.71 to 0.75 and for Sweden with \(\hat{d}\) values between 0.79 and 0.82 and Turkey reporting values for \(\hat{d}\) between 0.74 and 0.79. These latter results refer mostly to the ELW test with truncation parameters, \(\alpha \in [0.75,0.8]\).

Note that the remaining series show an order of integration never significantly different from 1, implying covariance non-stationarity and non-reversion to the cross sectional mean indicating no convergence.

These results are consistent with those reported by Barassi et al. (2011), which, using a different group and sample size, also found evidence of very slow convergence. However, they challenge those by Strazicich and List (2003) who found strong evidence of convergence for a sample of 23 OECD countries for a slightly shorter time span.

In the second stage of our analysis we investigate the long or infinite memory of the series in OECD Group controlling for the presence of breaks in the deterministic terms. Table 6 reports the \( R_{b}^{f}(\hat{d}_{T})\) statistics obtained from the de-trended relative per capita \(\hbox {CO}_{2}\) emissions for each country. Critical values are obtained by means of Monte-Carlo simulation for a sample of T=64. The results show that the null hypothesis of relative per capita \(\hbox {CO}_{2}\) emissions being non-stationary cannot be rejected suggesting that no break in mean has occurred in any of the relative \(\hbox {CO}_2\) emissions series for the 28 OECD countries in the sample. This result is partially consistent with Lanne and Liski (2004), who concluded that structural breaks could not explain the declining trend in \(\hbox {CO}_{2}\) emissions, although their analysis only focused on the \(I(0)-I(1)\) dichotomy.

Table 6 The 28 OECD countries: the \(R_b^f(\hat{d}_T)\) statistics for Mayoral’s test

In order to check the reliability of our results, we complement the Mayoral (2012) test with the CUSUM tests by Berkes et al. (2006). Table 7 displays the \(M_n\) statistics for Berkes’ test using the Bartlett kernel with Bw4 bandwidth which has been selected for its superior performance in smaller samples. The null hypothesis, stated in Eq. (10) is that there is a structural change in the deterministics of the \(\hbox {CO}_{2}\) emissions series.

Table 7 The 28 OECD countries: the \(M_n\) statistics for Berkes’ test

Figure 3 shows the \(M_{n}\) statistics for each of the 28 countries. The plot along with the statistics reported in Table 7, show that relative per capita \(\hbox {CO}_{2}\) emissions are either mean-reverting covariance non-stationary or unit root processes over the 64 years considered. These findings are consistent with those obtained by means of Mayoral’s test reported in Table 6.

Fig. 3
figure 3

The 28 OECD countries: the \({{M}}_{{n}}\) statistics for each country. This figure plots the \(M_n\) statistic (the blue line) for each country against time. The red line in each plot is the threshold given by the \(1\%\) critical value. The test rejects the null of stationarity with a break when the red line is crossed. If the \(M_n\) statistics cannot reject the null, then the lowest point in the blue line will be the break location

Table 8 OECD \(+\) BRICS: the \(R_b^f(\hat{d}_T)\) statistics for Mayoral’s test


We now repeat our analysis for an enlarged group obtained by adding to the 28 OECD economies, four of the BRICS: Brazil, China, India, South Africa for a total of 32 countries. Figure 5 shows the estimated order of fractional integration by means of LW, ELW, FELW and FELW with de-trending. The results for this larger sample are now very different from those obtained for the group of the 28 OECDs. We now find very weak evidence of convergence and only for 4 out of the 32 countries, namely Finland, Iceland, Norway and Greece. These results are most likely due to the fact that including four of the five BRICS is causing a significant change in the denominator of the fraction in Eq. (1) and thus a change to the constructed (individual) relative \(\hbox {CO}_{2}\) emissions series which are the object of our interest. Indeed, Brazil, China, India and South Africa have followed very different paths and experienced much faster rates of growth compared to the OECD economies. Thus, if emissions are a by-product of the economic activity of a country, it is plausible to expect \(\hbox {CO}_{2}\) emissions in the BRICS, which are economies largely based on the manufacturing and industrial sectors (which are more emission-intensive), to be of a magnitude and intensity which are very different from those of the rest of the OECD countries. In fact, unsurprisingly, we also observe total lack of convergence for the 4 BRICS both toward the cross sectional average calculated for the group of 32 countries as well as between themselves as highlighted in Table 12.

Table 8 computes the \(R_b^f(\hat{d}_T)\) statistics de-trended per capita \(\hbox {CO}_2\) emissions for each country. Results show that relative per capita \(\hbox {CO}_2\) emissions, in any country, cannot reject the null hypothesis by means of Mayoral’s test. This implies that these series are Integrated and possibly covariance non-stationary without any break in their deterministics (Figs. 4, 5).

In Table 9, we provide the \(M_n\) statistics for Berkes’ test applying the Bw4 bandwidth selection as explained in the methodology section. Recall that the null hypothesis tested here is Eq. 10.

Fig. 4
figure 4

OECD \(+\) BRICS: the relative per capita \({\hbox {CO}}_{{2}}\) emissions. This figure plots the relative per capita \(\hbox {CO}_2\) emissions for each country. It illustrates that these series move toward the cross-sectional mean, which equals to zero after de-meaning. The y axis in each sub-plot is the value of per capita \(\hbox {CO}_2\) emissions, and x axis is time

Fig. 5
figure 5

OECD \(+\) BRICS: the local Whittle related estimators. This figure plots the estimated order of fractional integration through LW, ELW, FELW and FELW with de-trending under the truncation parameter ranged from 0.65 to 0.8. The y axis in each sub-plot is the value of estimated fractional integration, and x axis is the truncation parameter m

Table 9 OECD \(+\) BRICS: the \(M_n\) statistics for Berkes’test

In Fig. 6, we plot the \(M_n\) statistics for each country per year. The results show that relative per capita \(\hbox {CO}_2\) emissions are indeed non-stationary over 64 years, although the significance in cases of Finland and Norway are slightly less pronounced. These findings are consistent with those obtained using Mayoral (2012) test.

Fig. 6
figure 6

OECD \(+\) BRICS: the \({{M}}_{{n}}\) statistics for each country. This figure plots the \(M_n\) statistic (the blue line) for each country against time. The red line in each plot is the threshold given by the \(1\%\) critical value. The test rejects the null of stationarity with a break when the red line is crossed. If the \(M_n\) statistics cannot reject the null, then the lowest point in the blue line will be the break location


This paper examined the long memory dynamics and existence of stochastic convergence in \(\hbox {CO}_{2}\) emissions employing yearly data for the period 1950–2013. Specifically, we adopt a rigorous statistical testing framework designed to detect structural breaks in time series with long memory in order to shed light on the presence of structural breaks as opposed to fractional integration (slow convergence). Furthermore, we provide evidence on whether the slow or non-convergence of \(\hbox {CO}_{2}\) emissions, as partially found in previous studies, is in fact real or it is the result of the occurrence of structural breaks. OECD countries as well as BRICS are analysed. Previous literature on the convergence of \(\hbox {CO}_{2}\) emissions has produced mixed results. We find that all series under investigation are integrated of order \(0<d \le 1\). In more detail, we find that the \(\hbox {CO}_{2}\) emissions in Scandinavian, and “greener” countries such as Austria and Switzerland, show mean reversion and therefore convergence toward the cross sectional mean of the sample comprising the 28 OECD countries, whereas most of the remaining series are I(1) implying lack of convergence in the period under investigation.

These results however do not hold if we add the BRICS to form an enlarged sample of 32 countries as the panel cross sectional mean is now very different (and so are the relative emissions series). Furthermore, we find that long memory and non-stationarity are not a result of structural changes in the deterministics as highlighted by means of structural change tests. Our findings can be usefully exploited by both economic modellers and policy makers in an effort to implement supranational policies aimed to reduce \(\hbox {CO}_{2}\) emissions. Firstly, as highlighted in the previous literature, modelling \(\hbox {CO}_{2}\) emissions using the traditional I(0) versus I(1) dichotomy, may produce misleading results as the fractional integration framework seems to be the appropriate methodology. A non-negligible number of relative \(\hbox {CO}_{2}\) emissions series appear to be long memory processes and thus, shocks even transitory policies will have long lasting if not permanent effects on emissions. Secondly, results provide evidence that the most polluting OECD countries are still not following the path taken by the greener ones where more stringent environmental policies have been and are being pursued. Our findings suggest that a further effort is needed from policy makers in the most polluting countries in order to achieve a fairer distribution of emissions leading to a convergence pattern.