Panel stationary tests against changes in persistence

Abstract In this paper we propose new panel tests to detect changes in persistence. The test statistics are used to test the null hypothesis of stationarity against the alternative of a change in persistence from I(0) to I(1), from I(1) to I(0), and in an unknown direction. The limiting null distributions of the tests are derived and evaluated in small samples by means of Monte Carlo simulations. An empirical illustration is also provided.


Introduction
Over the last two decades, a vast literature has investigated whether economic and financial time series may be characterized by a change in persistence between separate I(1) and I(0) regimes rather than simply I(1) or I(0) behavior. Changes of this kind in macroeconomic variables are well documented; see the literature reviews in Kim (2000) and Leybourne et al. (2003). A non-exhaustive list of the variables for which such phenomena have been observed includes inflation, real output, budgetary deficits, interest rates and exchange rates.

* Previous versions of the paper were presented at the First International Conference in Memory of Carlo Giannini, "Recent Development in Econometric Methodology", at University of Bergamo, and at seminars at University of Vienna and University of Leicester. The authors would like to thank seminar and conference participants, and in particular David Hendry, Chihwa Kao, Oliver Linton, Robert Kunst, Stephen Pollock, Wojciech Charemza, Panicos Demetriades, Qiang Zhang, Francesco Moscone, and two anonymous referees for many useful comments and suggestions. Westerlund would like to thank the Knut and Alice Wallenberg Foundation for financial support through a Wallenberg Academy Fellowship, and the Jan Wallander and Tom Hedelius Foundation for financial support under research grant number P2014-0112:1.
Interestingly, while many data sets are in fact panels of multiple time series, the way that existing tests are constructed requires that the series are tested one at a time. This is wasteful in the sense that each time a test is carried out the information contained in the other series is effectively ignored. The current paper can be seen as a reaction to this. The purpose is to develop tests for changes in persistence that exploit the multiplicity of series, and that can be seen as panel extensions of the time series tests of Kim (2000), Kim et al. (2002), and Busetti and Taylor (2004). The tests can be used to flexibly test the null hypothesis of stationarity against the alternative of a change in persistence not only from I(0) to I(1), and from I(1) to I(0), but also when the direction is unknown. The data generating process (DGP) considered is quite general. Some of the allowances are unit-specific constant and trend terms, cross-section heteroskedasticity, error serial correlation and cross-section dependence in the form of common factors. The asymptotic distributions of the tests are derived and evaluated in small samples using Monte Carlo simulation. An empirical illustration is also provided showing how inflation of 20 developed countries has undergone a shift from I(1) to I(0).
The rest of the paper is organized as follows. Sections 2 and 3 present the model, the test statistics, and their asymptotic distributions, which are evaluated using simulations in Section 4. Section 5 reports the results from the empirical application. Section 6 concludes.

Model and assumptions
Consider the panel data variable $Y_{i,t}$, where $i = 1, ..., N$ and $t = 1, ..., T$ index the cross-sectional units and the time periods, respectively. The DGP of this variable is given by

$$Y_{i,t} = \theta_i' D_{t,p} + \lambda_i' F_t + e_{i,t}, \qquad (1)$$

$$e_{i,t} = \mu_{i,t} + \varepsilon_{i,t}, \qquad (2)$$

where $D_{t,p} = (1, t, ..., t^p)'$ is a $p$-order trend polynomial such that $D_{t,p} = 0$ if $p = -1$, $F_t$ is an $r \times 1$ vector of common factors with $\lambda_i$ being the corresponding vector of factor loadings, and $\varepsilon_{i,t}$ is a mean zero and I(0) error term. The following three specifications of $\mu_{i,t}$ are considered, where $1(A)$, $\lfloor x \rfloor$, $\eta_{i,t}$ and $\tau_i^0 \in [0, 1]$ denote the indicator function of the event $A$, the integer part of $x$, a mean zero I(0) error term, and the break fraction, respectively:

MU1. I(0) → I(1): $\mu_{i,t} = \sum_{k=1}^{t} 1(k > \lfloor \tau_i^0 T \rfloor)\eta_{i,k}$.

MU2. I(1) → I(0): $\mu_{i,t} = \sum_{k=1}^{t} 1(k \le \lfloor \tau_i^0 T \rfloor)\eta_{i,k}$.

MU3. Unknown direction: $\mu_{i,t}$ is generated by either MU1 or MU2.

Under MU1 $Y_{i,t}$ is I(0) up to and including time $\lfloor \tau_i^0 T \rfloor$ but is I(1) after the break, provided that $\sigma^2_{\eta,i} = \mathrm{var}(\eta_{i,t}) > 0$. Under MU2 $Y_{i,t}$ is I(1) up to and including time $\lfloor \tau_i^0 T \rfloor$ but it is I(0) after the break, provided again that $\sigma^2_{\eta,i} > 0$. Therefore, the hypothesis of stationarity against a shift in persistence from I(0) to I(1) or vice versa can be stated as $H_0: \sigma^2_{\eta,1} = ... = \sigma^2_{\eta,N} = 0$ versus $H_1: \sigma^2_{\eta,i} > 0$ for at least some $i$. Whenever the alternative is I(0) → I(1) we write "$H_1$: I(0) → I(1)", whereas if the alternative is I(1) → I(0), we write "$H_1$: I(1) → I(0)".
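To make the three regimes concrete, the following is a minimal simulation sketch of the DGP for the special case $p = 0$ and a single common factor. It assumes that $\mu_{i,t}$ accumulates the shocks $\eta_{i,t}$ only after the break under MU1 and only up to the break under MU2, with $\mu_{i,0} = 0$. The function name `simulate_panel`, the AR(1) factor, and the distributions chosen for the loadings, constants and errors are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

def simulate_panel(N, T, tau0, sigma_eta, direction="I0_to_I1", rho=0.3, seed=0):
    """Simulate Y[i, t] = theta_i + lambda_i * F_t + mu[i, t] + eps[i, t].

    Illustrative sketch only: p = 0 (constant), one AR(1) factor, Gaussian errors.
    Returns the N x T data matrix Y and the N x T matrix of mu terms.
    """
    rng = np.random.default_rng(seed)

    # Common factor: stationary AR(1) (assumed form, |rho| < 1).
    v = rng.standard_normal(T)
    F = np.zeros(T)
    F[0] = v[0]
    for t in range(1, T):
        F[t] = rho * F[t - 1] + v[t]

    lam = 1.0 + rng.standard_normal(N)     # factor loadings (assumed distribution)
    theta = rng.standard_normal(N)         # unit-specific constants
    eps = rng.standard_normal((N, T))      # I(0) idiosyncratic errors
    eta = sigma_eta * rng.standard_normal((N, T))

    k = int(np.floor(tau0 * T))            # break date floor(tau0 * T)
    t_idx = np.arange(1, T + 1)
    if direction == "I0_to_I1":            # MU1: shocks accumulate after the break
        active = t_idx > k
    elif direction == "I1_to_I0":          # MU2: shocks accumulate up to the break
        active = t_idx <= k
    else:
        raise ValueError("direction must be 'I0_to_I1' or 'I1_to_I0'")

    mu = np.cumsum(eta * active, axis=1)   # starts from mu_{i,0} = 0
    Y = theta[:, None] + lam[:, None] * F[None, :] + mu + eps
    return Y, mu
```

Setting `sigma_eta=0.0` gives data generated under the null, since $\mu_{i,t}$ is then identically zero in both regimes.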
The conditions placed on the above DGP are given in Assumption 1, where $C < \infty$, $\mathrm{tr}(A)$, $||A|| = \sqrt{\mathrm{tr}(A'A)}$, $\to_p$ and $\mathcal{F}_{i,t}$ denote a generic positive constant, the trace and Euclidean norm of the (generic) matrix $A$, convergence in probability, and the sigma-field generated by $\{(\varepsilon_{i,n}, \eta_{i,n})\}_{n=1}^{t}$, respectively.

Assumption 1.
(iv) $\varepsilon_{i,t}$, $\eta_{i,t}$ and $F_t$ are mutually independent;
(v) $\mu_{1,0} = ... = \mu_{N,0} = 0$;

Remark 1. Assumption 1 puts restrictions on the time series and cross-sectional properties of $\varepsilon_{i,t}$ and $\eta_{i,t}$. The restrictions are very similar to the ones of Bai and Ng (2004), and we therefore refer to that paper for a detailed discussion. The main difference when compared to Bai and Ng (2004) is that here $F_t$ cannot be I(1). Thus, while $Y_{i,t}$ may be cross-correlated, it cannot be affected by common stochastic trends. However, we would like to point out that this assumption is mainly for ease of interpretation of the test outcome, for if $F_t$ is allowed to be I(1) the persistence of $Y_{i,t}$ cannot be inferred from $e_{i,t}$ alone, and in the present paper we focus on the testing of $e_{i,t}$. Hence, analogous to the PANIC approach of Bai and Ng (2004), if $F_t$ is permitted to be I(1), then we also need to test this variable.

The test statistics
The general testing idea is to first purge the effect of F t , and then to submit the resulting residuals to a test for a change in persistence. The implementation of the first step depends on whether F t is known or not.

F t known
Consider the generic variable $X_{i,t}$. The detrended version of this variable is henceforth denoted $X^p_{i,t} = X_{i,t} - \sum_{n=1}^{T} X_{i,n} a_{n,t,p}$, where $a_{n,k,p} = D_{n,p}' (\sum_{t=1}^{T} D_{t,p} D_{t,p}')^{-1} D_{k,p}$ and $p \ge 0$. If $p = -1$, then we define $X^p_{i,t} = X_{i,t}$. In this notation, the detrended and defactored version of $Y_{i,t}$ is given by $\hat e_{i,t} = Y^p_{i,t} - \hat\lambda_i' F^p_t$, where $\hat\lambda_i$ is the least squares (LS) slope estimator in a regression of $Y^p_{i,t}$ onto $F^p_t$. Thus, while in this section $F_t$ is assumed to be known, $\lambda_i$ is still treated as unknown. Consider the following test statistic, which is suitable for testing if cross-section unit $i$ is I(0) versus I(0) → I(1) (see, for example, Kim, 2000; Kim et al., 2002; Busetti and Taylor, 2004):

$$K_{i,T}(\tau) = \frac{(T - \lfloor T\tau \rfloor)^{-2} \sum_{t=\lfloor T\tau \rfloor + 1}^{T} S^1_{i,t}(\tau)^2}{\lfloor T\tau \rfloor^{-2} \sum_{t=1}^{\lfloor T\tau \rfloor} S^0_{i,t}(\tau)^2},$$

where $\tau \in [0, 1]$, $S^0_{i,t}(\tau) = \sum_{n=1}^{t} \hat e_{i,n}$ and $S^1_{i,t}(\tau) = \sum_{n=\lfloor T\tau \rfloor + 1}^{t} \hat e_{i,n}$. The error sequences $\{\hat e_{i,n}\}_{n=1}^{\lfloor T\tau \rfloor}$ and $\{\hat e_{i,n}\}_{n=\lfloor T\tau \rfloor + 1}^{T}$ come from two separate regressions; while the former uses only the first $\lfloor T\tau \rfloor$ observations, the latter uses only the last $T - \lfloor T\tau \rfloor$ observations.

Remark 2. The $K_{i,T}(\tau)$ test considered here is in the spirit of Kwiatkowski et al. (1992), in which the constant I(0) null is tested versus the constant I(1) alternative. An alternative approach is to follow Banerjee et al. (1992) and Leybourne et al. (2003), who use the Dickey-Fuller statistic, in which the null and the alternative hypotheses are reversed. Panel variants of these can be constructed in the same way as the one suggested below for $K_{i,T}(\tau)$ (see Demetrescu and Hanck, 2013, for such a proposal).
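The computation of the unit-specific statistic can be sketched as follows. The sketch assumes the usual ratio form of the statistic in Kim (2000) and Busetti and Taylor (2004): scaled sums of squared partial sums of the post-break residuals over those of the pre-break residuals, with each subsample detrended and defactored in its own LS regression. The function names `detrend`, `defactor` and `k_stat` are hypothetical helpers introduced for illustration only.

```python
import numpy as np

def detrend(x, p):
    """Residuals from an LS regression of x on (1, t, ..., t^p); p = -1 returns x."""
    x = np.asarray(x, dtype=float)
    if p < 0:
        return x.copy()
    t = np.arange(1, len(x) + 1, dtype=float)
    D = np.vander(t, p + 1, increasing=True)          # columns 1, t, ..., t^p
    beta, *_ = np.linalg.lstsq(D, x, rcond=None)
    return x - D @ beta

def defactor(y, F, p):
    """Detrend y and the known factors F, then remove the LS projection on F."""
    yp = detrend(y, p)
    if F is None:
        return yp
    F = np.asarray(F, dtype=float).reshape(len(yp), -1)
    Fp = np.column_stack([detrend(F[:, k], p) for k in range(F.shape[1])])
    lam, *_ = np.linalg.lstsq(Fp, yp, rcond=None)     # LS estimate of the loadings
    return yp - Fp @ lam

def k_stat(y, tau, p=0, F=None):
    """K(tau) for one series: post-break over pre-break scaled partial-sum variation."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    k = int(np.floor(T * tau))                        # last pre-break observation
    e0 = defactor(y[:k], None if F is None else np.asarray(F)[:k], p)
    e1 = defactor(y[k:], None if F is None else np.asarray(F)[k:], p)
    num = np.sum(np.cumsum(e1) ** 2) / (T - k) ** 2
    den = np.sum(np.cumsum(e0) ** 2) / k ** 2
    return num / den
```

Under this form, a series that turns into a random walk after the break produces large values of `k_stat`, while the reciprocal becomes large when the random walk segment comes first.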
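Let $C = [\tau_{\min}, \tau_{\max}] \subseteq (0, 1)$. In this paper, we consider three transformations to eliminate the dependence on $\tau$ in $K_{i,T}(\tau)$ (see, for example, Kim, 2000):

T1. The maximum-Chow transformation: $K^1_{i,T} = H^1(K_{i,T}(\tau)) = \max_{\tau \in C} K_{i,T}(\tau)$.

T2. The mean-exponential transformation: $K^2_{i,T} = H^2(K_{i,T}(\tau)) = \log \int_{\tau \in C} \exp(K_{i,T}(\tau)/2) \, d\tau$.

T3. The mean score transformation: $K^3_{i,T} = H^3(K_{i,T}(\tau)) = \int_{\tau \in C} K_{i,T}(\tau) \, d\tau$.

In Appendix (Proof of Theorem 1), we show that $K_{i,T}(\tau) \to_w K_i(\tau)$ as $T \to \infty$, where $\to_w$ signifies weak convergence and $K_i(\tau)$ is a certain ratio of stochastic integrals. Since the transformations are continuous, $K^j_{i,T} \to_w K^j_i = H^j(K_i(\tau))$ for $j = 1, 2, 3$, and we denote by $\mu_{K,j}$ and $\sigma^2_{K,j}$ the mean and variance of $K^j_i$, respectively. Numerical values of $\mu_{K,j}$ and $\sigma_{K,j}$ are reported in Table 1. The proposed panel test statistic for testing $H_0$ versus $H_1$: I(0) → I(1) is given by

$$K^j_{NT} = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \frac{K^j_{i,T} - \mu_{K,j}}{\sigma_{K,j}}.$$

For testing if cross-section unit $i$ is I(0) versus I(1) → I(0), the following "reverse" test statistic can be used (see Kim, 2000; Kim et al., 2002; Busetti and Taylor, 2004):

$$R_{i,T}(\tau) = K_{i,T}(\tau)^{-1},$$

which can be transformed using T1-T3 to eliminate the dependence on $\tau$. The resulting transformed statistic is written in an obvious notation as $R^j_{i,T}$. Based on this test statistic, we may define

$$R^j_{NT} = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \frac{R^j_{i,T} - \mu_{R,j}}{\sigma_{R,j}},$$

with obvious definitions of $\sigma^2_{R,j}$ and $\mu_{R,j}$. When the direction of the change in persistence is unknown, the following maximum statistic may be used:

$$M^j_{NT} = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \frac{M^j_{i,T} - \mu_{M,j}}{\sigma_{M,j}}, \qquad M^j_{i,T} = \max\{K^j_{i,T}, R^j_{i,T}\},$$

with $\mu_{M,j}$ and $\sigma^2_{M,j}$ defined analogously.

Theorem 1. Under $H_0$ and Assumption 1, as $N, T \to \infty$ with $N/T \to 0$,

$$K^j_{NT}, \; R^j_{NT}, \; M^j_{NT} \to_d N(0, 1),$$

where $\to_d$ signifies convergence in distribution.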
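The step from a sequence of $K_{i,T}(\tau)$ values to a standardized panel statistic can be sketched as below. The sketch assumes that T1 is the maximum over the grid, T2 is the log of the average of $\exp(K/2)$, and T3 is the average of $K$. Grid averages are used in place of integrals, so the centering and scaling constants must be computed under the same convention. The arguments `mu` and `sigma` stand in for the tabulated moments and are supplied by the user; no values from Table 1 are assumed here, and the function names are hypothetical.

```python
import numpy as np

def transforms(k_values):
    """Return (T1, T2, T3) for the values of K(tau) evaluated on a grid over C."""
    k = np.asarray(k_values, dtype=float)
    t1 = k.max()                                    # maximum-Chow
    m = (k / 2.0).max()                             # factor out the max for stability
    t2 = m + np.log(np.mean(np.exp(k / 2.0 - m)))   # mean-exponential
    t3 = k.mean()                                   # mean score
    return t1, t2, t3

def panel_stat(unit_stats, mu, sigma):
    """Standardized sum N^{-1/2} * sum_i (stat_i - mu) / sigma."""
    u = np.asarray(unit_stats, dtype=float)
    return np.sum((u - mu) / sigma) / np.sqrt(len(u))
```

For the reverse statistic the same functions apply to `1.0 / k_values`, and for the unknown-direction case to the unit-wise maximum of the two transformed statistics, each with its own centering and scaling constants.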

Remark 3.
While the test statistics considered here are independent of $\tau^0_1, ..., \tau^0_N$, in applications it is sometimes useful to be able to estimate these parameters. This can be accomplished using the proposal of Kim (2000, Section 3.2), which basically amounts to setting $\hat\tau^0_i$ equal to the suitably maximizing or minimizing value of $K_{i,T}(\tau)$, depending on whether it is I(0) → I(1) or I(1) → I(0) that is being tested. Alternatively, we may follow Busetti and Taylor (2004, Section 6.2), who suggest setting $\hat\tau^0_i$ equal to the value of $\tau^0_i$ that minimizes the sum of squares of $\hat e_{i,t}$.
Remark 4. The requirement that N/T → 0 is sufficient but not necessary and is needed to make sure that certain remainder terms are negligible. However, the order of these terms is not the sharpest possible. A more elaborate asymptotic analysis would be required to obtain the exact order. In Section 4, we use Monte Carlo simulation to evaluate the effect of N/T in small samples.

F t unknown
The estimation of $F_t$ can be performed in two ways: (i) unrestrictedly, or (ii) restricted under $H_0$. In both cases, we follow the bulk of the previous literature and use the principal components method (see, for example, Bai and Ng, 2004). The restricted estimator of $e_{i,t}$ that we will be considering can now be constructed as Let $X^{p-1}_{i,t}$ be $X_{i,t}$ when detrended using a trend polynomial of order $p - 1$. Hence,

As Theorem 2 makes clear, the factors can be unknown and still the asymptotic distributions of the test statistics are $N(0, 1)$. This is in agreement with the results reported by Bai and Ng (2004) for their pooled panel unit root tests.
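A minimal sketch of principal components estimation applied to the (detrended) levels, which is one way to read the restricted approach, is given below. It assumes the common normalization $\hat F'\hat F / T = I_r$ and takes the number of factors `r` as given. The function name `pc_factors` is hypothetical, and the exact construction used in the paper may differ in its details.

```python
import numpy as np

def pc_factors(Y, r):
    """Principal components estimates from a T x N matrix of detrended data.

    Returns F_hat (T x r), Lam_hat (N x r) and the defactored residuals (T x N).
    """
    Y = np.asarray(Y, dtype=float)
    T, N = Y.shape
    # Eigen-decomposition of the symmetric T x T matrix Y Y' / (N T).
    eigval, eigvec = np.linalg.eigh(Y @ Y.T / (N * T))
    order = np.argsort(eigval)[::-1][:r]            # indices of the r largest eigenvalues
    F_hat = np.sqrt(T) * eigvec[:, order]           # normalization F'F / T = I_r
    Lam_hat = Y.T @ F_hat / T                       # LS loadings given F_hat
    E_hat = Y - F_hat @ Lam_hat.T                   # estimated idiosyncratic part
    return F_hat, Lam_hat, E_hat
```

The columns of `E_hat` can then be passed unit by unit to a change-in-persistence statistic. Factors and loadings are identified only up to rotation, so `F_hat` should be compared with a true factor through correlations rather than levels.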

Monte Carlo simulations
A small-scale Monte Carlo study was conducted to investigate the properties of the new tests in small samples. The DGP is given by a restricted version of (1)-(2) that sets $F_t = \rho F_{t-1} + v_t$, where $v_t \sim N(0, 1)$ and $\rho \in \{0.3, 0.6\}$ (see, for example, Gengenbach et al., 2010, for a similar parametrization). For $\sigma_{\varepsilon,i}$, we consider two cases. In the first, $\sigma_{\varepsilon,i} = 1$ for all $i$, while in the second, $\sigma_{\varepsilon,i} \sim U(1, 2)$. Since a more volatile idiosyncratic error will make $F_t$ more difficult to discern, we expect that the results for the second case will deteriorate when compared to the first. All results are based on 1,000 replications of samples of size $N \in \{5, 10, 20\}$ and $T \in \{50, 100\}$. Also, following Kim (2000), $C = [0.20, 0.80]$. Results were obtained for $p \in \{0, 1\}$, although in this paper we focus on the results for the empirically most common specification with $p = 0$ (a constant but no trend). The results for $p = 1$ (constant and trend) can be obtained upon request. Both the restricted and unrestricted factor estimation methods were simulated. Interestingly, the restricted method led to better results in terms of both size accuracy and power. In this paper, we therefore only report the results for the restricted method, where the number of common factors is determined using the $IC_2$ criterion of Bai and Ng (2002). Tables 2 and 3 report the results for I(0) → I(1), while Tables 4 and 5 contain the corresponding results for I(1) → I(0). The information content of these tables may be summarized as follows.
• All tests have good size accuracy when $\sigma_{\varepsilon,i} = 1$ and $\rho = 0.3$. This is true for all constellations of $T$ and $N$ considered, although the distortions do have a tendency to increase slightly in $N$, which is consistent with the previous panel unit root literature (see Westerlund and Breitung, 2013, for a discussion). While there are no big differences, the best size accuracy is generally obtained by using $K^2_{NT}$, $R^2_{NT}$ and $M^2_{NT}$, whereas $K^3_{NT}$, $R^1_{NT}$ and $R^3_{NT}$ generally lead to the worst accuracy.
• As expected, increases in ρ and/or σ ε,i generally lead to reduced size accuracy, although the distortions are never very large. This is true regardless of the direction of the change in persistence. In fact, the results are remarkably stable, given that the test statistics do not require any corrections to account for nuisance parameters.
• All tests perform quite well in terms of power, and there are clear improvements as N and/or T increases. The fact that power is not only increasing in T, but also in N illustrates the advantage of accounting for the cross-sectional variation of the data.
Power is also increasing in the distance to the null, as measured by σ η , which is again just as expected.
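How rejection frequencies of this kind are obtained can be sketched with a deliberately stripped-down experiment. The sketch below makes several simplifying assumptions that are not the paper's design: there is no common factor, $p = 0$, the errors are iid Gaussian, only a grid-average version of the mean score transformation is used, the centering and scaling constants are themselves simulated at the same $T$ rather than taken from Table 1, and the test is treated as right-tailed with the 5% normal critical value 1.645. All function names are hypothetical.

```python
import numpy as np

def k3(y, grid):
    """Grid average of K(tau) for one series, demeaning each subsample (p = 0)."""
    T = len(y)
    vals = []
    for tau in grid:
        k = int(np.floor(T * tau))
        e0 = y[:k] - y[:k].mean()                   # pre-break residuals
        e1 = y[k:] - y[k:].mean()                   # post-break residuals
        num = np.sum(np.cumsum(e1) ** 2) / (T - k) ** 2
        den = np.sum(np.cumsum(e0) ** 2) / k ** 2
        vals.append(num / den)
    return float(np.mean(vals))

def draw_panel(rng, N, T, sigma_eta, tau0=0.5):
    """I(0) noise plus a random walk that switches on after floor(tau0 * T)."""
    eps = rng.standard_normal((N, T))
    eta = sigma_eta * rng.standard_normal((N, T))
    active = np.arange(1, T + 1) > int(np.floor(tau0 * T))
    return eps + np.cumsum(eta * active, axis=1)

def rejection_rate(N, T, sigma_eta, mu, sigma, reps, rng, grid, crit=1.645):
    """Share of replications in which the standardized panel statistic exceeds crit."""
    rejections = 0
    for _ in range(reps):
        Y = draw_panel(rng, N, T, sigma_eta)
        stats = np.array([k3(Y[i], grid) for i in range(N)])
        panel = np.sum((stats - mu) / sigma) / np.sqrt(N)
        rejections += int(panel > crit)
    return rejections / reps

rng = np.random.default_rng(12345)
N, T = 5, 50
grid = np.linspace(0.2, 0.8, 13)                    # C = [0.20, 0.80]

# Simulated null moments of the unit-level statistic (stand-in for tabulated values).
null_draws = np.array([k3(rng.standard_normal(T), grid) for _ in range(2000)])
mu, sigma = null_draws.mean(), null_draws.std(ddof=1)

size = rejection_rate(N, T, 0.0, mu, sigma, 300, rng, grid)   # sigma_eta = 0: null
power = rejection_rate(N, T, 1.0, mu, sigma, 300, rng, grid)  # sigma_eta = 1: break
print(f"size = {size:.3f}, power = {power:.3f}")
```

Repeating the last two lines over a grid of `N`, `T` and `sigma_eta` values yields a table of empirical size and power in the same layout as the ones discussed above.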

Empirical illustration
The question of whether inflation should be considered as I(0)

The number of common factors is determined in the same way as in the simulations. As is customary when dealing with inflation (see, for example, Leybourne et al., 2003), the tests are fitted with a constant but no trend. The results are reported in Table 6. The first thing to note is that while in case of $K^1_{NT}$, $K^2_{NT}$ and $K^3_{NT}$ there is no evidence against the I(0) null, $R^1_{NT}$, $R^2_{NT}$ and $R^3_{NT}$ all lead to a clear rejection. This is true even at the most conservative 1% level. We therefore conclude that inflation has been subject to a change in persistence from I(1) to I(0), which is in agreement with the recent empirical literature based on US data (see, for example, Busetti and Taylor, 2004; Harvey et al., 2006). A common explanation for the observed change in persistence of inflation in the US is that it is due to the stock market collapse of the late 1980s and the recession that followed it. One interpretation of the results reported in the current paper is therefore that they reflect the worldwide recession of the early 1990s, which was to a large extent triggered by the recession in the US. Another possibility is that the results reflect in part monetary policy shifts (see, for example, Davig and Doh, 2014, and the references provided therein).

Conclusion
This paper develops panel tests that are suitable for testing the null hypothesis of stationarity against the alternative of a change in persistence from I(0) to I(1), from I(1) to I(0), or when the direction is unknown. The DGP used for this purpose is quite general and allows unit-specific constant and trend terms, cross-section heteroskedasticity, error serial correlation and cross-section dependence in the form of common factors.

Appendix: Proofs
The proofs of Theorems 1 and 2 are established for K j NT ; the proofs for R j NT and M j NT are entirely analogous.

Proof of Theorem 1.
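Under MU1, $\mu_{i,t} = \sum_{k=1}^{t} 1(k > \lfloor T\tau \rfloor)\eta_{i,k}$, and by further invoking $H_0$, $\mu_{i,t} = 0$, giving

$$Y_{i,t} = \theta_i' D_{t,p} + \lambda_i' F_t + \varepsilon_{i,t}.$$

It follows that

$$Y^p_{i,t} = \lambda_i' F^p_t + \varepsilon^p_{i,t},$$

with obvious definitions of $F^p_t$ and $\varepsilon^p_{i,t}$, which in turn implies

$$\hat e_{i,t} = Y^p_{i,t} - \hat\lambda_i' F^p_t = \varepsilon^p_{i,t} - (\hat\lambda_i - \lambda_i)' F^p_t.$$

Therefore,

$$T^{-1/2} \sum_{n=1}^{t} \hat e_{i,n} = T^{-1/2} \sum_{n=1}^{t} \varepsilon^p_{i,n} - (\hat\lambda_i - \lambda_i)' \, T^{-1/2} \sum_{n=1}^{t} F^p_n.$$

Under $H_0$ and with $F_t$ known, $Y_{i,t} = \theta_i' D_{t,p} + \lambda_i' F_t + \varepsilon_{i,t}$ is just an ordinary time series regression in I(0) variables with exogenous regressors. It follows that $\sqrt{T}(\hat\lambda_i - \lambda_i) = O_p(1)$, and therefore, since $T^{-1/2} \sum_{n=1}^{t} F^p_n = O_p(1)$,

$$T^{-1/2} \sum_{n=1}^{t} \hat e_{i,n} = T^{-1/2} \sum_{n=1}^{t} \varepsilon^p_{i,n} + O_p(T^{-1/2}).$$

Hence, using $\bar K_{i,T}(\tau)$ to denote $K_{i,T}(\tau)$ with $\hat e_{i,n}$ replaced by $e^p_{i,n} = \varepsilon^p_{i,n}$, we have

$$K_{i,T}(\tau) = \bar K_{i,T}(\tau) + O_p(T^{-1/2}),$$

where the first term on the right is the same as in Harvey et al. (2006). It follows from their results that as $T \to \infty$, $\bar K_{i,T}(\tau) \to_w K_i(\tau)$, where $\to_w$ signifies weak convergence, and $K_i(\tau)$ is a ratio of stochastic integrals in functionals of $W_{\varepsilon,i}(r)$ and $D_p(r)$, with $W_{\varepsilon,i}(r)$ being a standard Brownian motion, and $D_p(r)$ is such that $Q_T^{-1} D_{\lfloor Tr \rfloor, p} \to D_p(r)$, where $Q_T = \mathrm{diag}(1, T, ..., T^p)$. Note in particular how $D_0(r) = 1$ and $D_1(r) = (1, r)'$. Therefore, by the continuous mapping theorem, and writing $K^j_{i,T} = H^j(K_{i,T}(\tau))$ and $\bar K^j_{i,T} = H^j(\bar K_{i,T}(\tau))$ as in Busetti and Taylor (2004),

$$K^j_{i,T} = \bar K^j_{i,T} + O_p(T^{-1/2}) \to_w K^j_i = H^j(K_i(\tau)).$$

Let us now consider $K^j_{NT}$. By using the previous result,

$$K^j_{NT} = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \frac{\bar K^j_{i,T} - \mu_{K,j}}{\sigma_{K,j}} + O_p(\sqrt{N} T^{-1/2}),$$

where $O_p(\sqrt{N} T^{-1/2}) = o_p(1)$ under our assumption that $N/T = o(1)$. We now use the same steps as in Moon and Phillips (2000, page 994) to verify that $(\bar K^j_{i,T} - \mu_{K,j})$ satisfies conditions (i)-(iv) of the central limit theorem of Phillips and Moon (1999, Theorem 2). In so doing we follow their notation and write $Q_{i,T} = (\bar K^j_{i,T} - \mu_{K,j})$, which is iid with mean zero and variance $\sigma^2_{K,j} \le C$. We have already shown that $Q_{i,T} \to_w Q_i = K^j_i - \mu_{K,j}$ as $T \to \infty$. This verifies conditions (i), (ii) and (iv). Condition (iii) follows from noting that, by the continuous mapping theorem, $Q^2_{i,T} \to_w Q^2_i$. It follows that

$$K^j_{NT} \to_d N(0, 1)$$

as $N, T \to \infty$ with $N/T \to 0$.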

Proof of Theorem 2.
We begin by considering the case when the estimator of $e_{i,t}$ is based on the restricted estimators of $\lambda_i$ and $F_t$ under $H_0$. As in Proof of Theorem 1, under MU1 and $H_0$, $Y^p_{i,t} = \lambda_i' F^p_t + \varepsilon^p_{i,t}$. In order to capture the fact that $\lambda_i$ and $F_t$ are not separately identifiable we introduce the $r \times r$ rotation matrix $H$ such that Hence, By Lemmas 1 (c) and 2 of Bai and Ng (2004) where the latter result holds uniformly in $t$. Hence, since we can show that Hence, as in the case when $F_t$ is known (see Proof of Theorem 1), the estimation and removal of the common component do not affect the asymptotic distribution of the test statistic. Specifically, using $K^j_{0i,T}$ to denote $K^j_{i,T}$ with $\hat e^0_{i,n}$ in place of $\hat e_{i,n}$, we get which holds uniformly in $(j, i)$. In order to show that the resulting panel statistic, $K^j_{0NT}$ say, converges to $N(0, 1)$, we may use the same argument as in Westerlund and Larsson (2009).
Consider the unrestricted estimator of $e_{i,t}$. We have $\tilde e^1$ ). From Proof of Theorem 3 in Bai (2003), using $V$ to denote a diagonal matrix consisting of the first $r$ eigenvalues of $(NT)^{-1} y^{p-1} (y^{p-1})'$ in decreasing order, (2003), By using this and $\hat F^1$ suggesting that for $p \ge 0$, When appropriately normalized by $T^{-1/2}$, taking partial sums does not affect the order of the remainder terms. Hence, again, the estimation and removal of the common component do not affect the asymptotic distribution of the test statistic.

Table 2: 5% size and power when testing I(0) → I(1) and ρ = 0.3.
Notes: $\sigma_\eta$ and $\sigma_{\varepsilon,i}$ refer to the standard deviation of $\eta_{i,t}$ and $\varepsilon_{i,t}$, respectively, while $\rho$ refers to the autoregressive coefficient of $F_t$. The results are based on setting $p = 0$ (constant) and using the restricted factor estimation method, which assumes that the null hypothesis is true.

Table 3: 5% size and power when testing I(0) → I(1) and ρ = 0.6.
Notes: See Table 2 for an explanation.

Table 4. Notes: See Table 2 for an explanation.

Table 5. Notes: See Table 2 for an explanation.

Table 6. Notes: ***, ** and * denote significance at the 1%, 5% and 10% levels, respectively. While the restricted factor estimation method assumes that the null hypothesis is true, the unrestricted method does not.