Quantile inference for nonstationary processes with infinite variance innovations

Based on quantile regression, we extend the work of Koenker and Xiao (2004) and Ling and McAleer (2004) from finite-variance innovations to infinite-variance innovations. We propose a robust t-ratio statistic to test for a unit root, together with a re-sampling method to approximate the critical values of the t-ratio statistic. It is shown that the limit distribution of the statistic is a functional of stable processes and a Brownian bridge. The finite sample studies show that, when infinite-variance disturbances exist, the proposed t-ratio test performs significantly better in both size and power than the conventional unit-root tests based on the least squares procedure, such as the augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) tests. The quantile Kolmogorov-Smirnov (QKS) statistic and the quantile Cramer-von Mises (QCM) statistic are also considered, but the finite sample studies show that they perform poorly in power and size, respectively. An application to the Consumer Price Index for nine countries is also presented.


§1 Introduction
An extensive literature in economics and finance suggests that many economic time series are well characterized as autoregressive processes with a unit root. The augmented Dickey-Fuller (ADF) test, proposed by Dickey and Fuller (1979, 1981), the Phillips and Perron (PP) (1988) test, and the Kwiatkowski, Phillips, Schmidt and Shin (KPSS) (1992) test are the most frequently used unit root tests, and are vital tools in time series econometrics. These test statistics were all derived under the ordinary least squares (OLS) framework. However, as Koenker and Xiao (2004) pointed out, if the innovation distribution deviates from the normal distribution, these tests exhibit poor power performance.
For a (nearly) unit-root process with finite variance innovations, many robust approaches have been proposed to avoid imposing the restrictive normality assumption. For example, Cox and Llatas (1991) developed a test based on the M-estimate for an AR(1) process with a (near) unit root; Lucas (1995) considered a unit root test based on a nonparametric modification of the M-estimate, focusing on the Huber and Student t procedures; Breitung and Gourieroux (1997) and Hasan and Koenker (1997) proposed rank-type tests within the ADF framework; Koenker and Xiao (2004) proposed a t-ratio test at selected quantiles, a Kolmogorov-Smirnov test and a Cramer-von Mises test in a quantile autoregressive framework; Galvao (2009) extended the tests of Koenker and Xiao (2004) by including stationary covariates in the quantile autoregression.
However, a large number of empirical studies in macroeconomics and finance indicate that heavy-tailed time series provide better models for such data. For example, Fama (1965) and Mandelbrot (1963, 1967) argued that the distributions of commodity and stock returns are often heavy-tailed with possibly infinite variance; Rachev and Mittnik (2000) considered stable Paretian models in finance; Lux and Marchesi (2000) studied agent-based models; and Charemza, Hristova and Burridge (2005) studied inflation data with heavy tails. More background information on heavy-tailed time series and their applications can be found in Finkenstädt and Rootzén (2003).
Owing to their importance in applications, unit-root tests with infinite-variance innovations have attracted increasing attention in recent years. For example, Chan and Tran (1989) and Rachev, Mittnik and Kim (1998) considered the limit distribution for LSE-based tests; Knight (1989) developed tests based on M-estimates and least absolute deviation (LAD) estimates for the case with i.i.d. heavy-tailed noise, while Knight (1991) extended these results to the case with infinite-order moving average dynamics; Phillips (1990) generalized the Phillips and Perron (1988) test to processes driven by weakly dependent shocks whose innovations display infinite variance; Samarakoon and Knight (2009) considered an M-estimate based test for finite-order autoregressive processes driven by infinite-variance innovations; Chan and Zhang (2009) studied the least squares estimate of the autoregressive coefficient of a nearly nonstationary autoregressive model with strongly dependent and infinite-variance innovations; and Chan and Zhang (2010) considered the quantile estimate and the semi-parametric estimate of the autoregressive parameters with long- and short-range dependent innovations.
It is well known that when the noise is dependent, a more powerful unit-root test is based on the differenced process; a typical example is the augmented Dickey-Fuller (ADF) test. Adopting this idea, the first contribution of this paper is to extend the work of Koenker and Xiao (2004) and Ling and McAleer (2004) from finite-variance innovations to infinite-variance innovations. More precisely, we consider the following model:

$$\Delta y_t = \delta y_{t-1} + \sum_{i=1}^{p} \phi_i \Delta y_{t-i} + u_t, \qquad (1.1)$$

where $\Delta y_t = y_t - y_{t-1}$ and $\{u_t\}$ is an i.i.d. sequence lying in the domain of attraction of a stable law with tail index $\alpha \in (0, 2)$, i.e., there exist two sequences $\{a_n\}$ and $\{b_n\}$ such that

$$a_n^{-1}\Big(\sum_{t=1}^{n} u_t - b_n\Big) \xrightarrow{d} Z_\alpha, \qquad (1.2)$$

where $\xrightarrow{d}$ denotes convergence in distribution, and $Z_\alpha$ is a stable variable with tail index $\alpha$.
For simplicity, we also write such $\{u_t\}$ as $u_t \in ND(\alpha)$. By (1.1), testing the unit-root hypothesis is equivalent to testing whether $\delta = 0$. The second contribution is that we propose three statistics to test for a unit root, together with a re-sampling method to approximate the critical values of the t-ratio statistic. The first statistic is a t-ratio test at a given quantile level; the other two are the quantile Kolmogorov-Smirnov (QKS) and quantile Cramer-von Mises (QCM) tests, which use the information from all quantiles in some interval. It is shown that these statistics converge to functionals of stable processes and a Brownian bridge, and that they perform better than tests based on the LSE procedure, especially the t-ratio statistic.
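To make the data-generating process concrete, the following minimal sketch simulates a path from an ADF-type regression of the form $\Delta y_t = \delta y_{t-1} + \sum_i \phi_i \Delta y_{t-i} + u_t$ with heavy-tailed innovations. Several choices here are assumptions made only for illustration: the innovations are drawn from a symmetric $\alpha$-stable law using the Chambers-Mallows-Stuck representation (one member of the domain of attraction, not the only one), the start-up values are set to zero, and the function names `symmetric_stable` and `simulate_unit_root_ar` are hypothetical.

```python
import math
import random


def symmetric_stable(alpha, rng):
    """Draw one symmetric alpha-stable variate (Chambers-Mallows-Stuck)."""
    angle = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    expo = rng.expovariate(1.0)                     # unit exponential
    if alpha == 1.0:
        return math.tan(angle)                      # Cauchy special case
    scale = math.sin(alpha * angle) / math.cos(angle) ** (1.0 / alpha)
    tail = (math.cos((1.0 - alpha) * angle) / expo) ** ((1.0 - alpha) / alpha)
    return scale * tail


def simulate_unit_root_ar(n, delta, phi, alpha, seed=0):
    """Simulate y_0, ..., y_n; delta = 0 gives a unit-root process.

    ASSUMPTION: y_0 = 0 and all pre-sample differences are zero.
    """
    rng = random.Random(seed)
    p = len(phi)
    u = [symmetric_stable(alpha, rng) for _ in range(n)]
    y = [0.0]
    dy = [0.0] * p
    for t in range(n):
        lagged = sum(phi[i] * dy[-1 - i] for i in range(p))
        step = delta * y[-1] + lagged + u[t]
        dy.append(step)
        y.append(y[-1] + step)
    return y, u


# Example: a unit-root path with infinite-variance innovations (alpha = 1.5).
path, shocks = simulate_unit_root_ar(500, 0.0, [0.3], 1.5, seed=42)
```

For $\alpha < 2$ a few very large shocks dominate the path, which is the feature that makes least-squares based unit-root tests fragile in this setting.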
The paper is organized as follows. In Section 2, we introduce the model and estimation, and give the asymptotic results for the quantile autoregression parameters. The test statistics and their large sample distributions are presented in Section 3. The finite sample properties, examined by Monte Carlo simulation, are shown in Section 4. A real data example is given in Section 5. The proofs are given in Section 6. Throughout the paper, the symbol "$\xrightarrow{d}$" denotes convergence in distribution, and "$\Longrightarrow$" denotes weak convergence on M-topology.

§2 Parameter Estimation

Quantile Regression
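In this subsection, we consider estimation of the parameter $(\delta, \phi_1, \ldots, \phi_p)'$ in model (1.1). Let $Q_u(\tau)$ be the $\tau$-th quantile of $u_t$ and $\mathcal{F}_t$ be the $\sigma$-field generated by $\{u_s, s \le t\}$. The $\tau$-th conditional quantile of $\Delta y_t$ with respect to $\mathcal{F}_{t-1}$ is given by

$$Q_{\Delta y_t}(\tau \mid \mathcal{F}_{t-1}) = Q_u(\tau) + \delta y_{t-1} + \sum_{i=1}^{p} \phi_i \Delta y_{t-i} = x_t'\phi(\tau), \qquad (2.2)$$

with $x_t = (1, y_{t-1}, \Delta y_{t-1}, \ldots, \Delta y_{t-p})'$ and $\phi(\tau) = (Q_u(\tau), \delta, \phi_1, \ldots, \phi_p)'$, and the estimator $\hat\phi(\tau)$ of $\phi(\tau)$ can be obtained as follows:

$$\hat\phi(\tau) = \arg\min_{\phi \in \mathbb{R}^{p+2}} \sum_{t=1}^{n} \rho_\tau(\Delta y_t - x_t'\phi), \qquad (2.3)$$

where $\rho_\tau(u) = u(\tau - I(u < 0))$. Equation (2.2) is a linear conditional quantile function, and the solution to (2.3) can be obtained by standard linear programming.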
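The role of the check function $\rho_\tau$ is easiest to see in the intercept-only special case, where minimizing $\sum_t \rho_\tau(y_t - q)$ over $q$ returns a sample $\tau$-quantile. The sketch below illustrates only this special case and is not the linear programming solver mentioned above; the function names are hypothetical, and the brute-force search over data points is used purely for transparency.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def intercept_only_quantile(values, tau):
    """Minimise sum_t rho_tau(y_t - q) over q.

    The objective is piecewise linear and convex in q, so a minimiser is
    always attained at one of the observations; we simply scan them.
    """
    def objective(q):
        return sum(check_loss(v - q, tau) for v in values)

    return min(sorted(values), key=objective)


sample = [3.0, -1.0, 10.0, 0.5, 2.0]
median = intercept_only_quantile(sample, 0.5)    # the sample median
upper = intercept_only_quantile(sample, 0.9)     # an upper sample quantile
```

Adding regressors replaces the scalar $q$ by a linear predictor, which is why the full problem is a linear program rather than a sort.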

Asymptotic properties
Let models (1.1) and (1.2) hold. Under the null hypothesis that $y_t$ is a unit-root process (i.e. $\delta = 0$), the components of $\hat\phi(\tau)$ converge at different rates. Specifically, the estimator of $Q_u(\tau)$ converges at rate $\sqrt{n}$, $\hat\delta$ converges at rate $\sqrt{n}a_n$, while the other components converge at rate $a_n$, where $a_n = \inf\{x : P(u_t > x) < 1/n\}$ is given as in (1.2). Therefore, we introduce the standardization matrix $D_n = \mathrm{diag}(\sqrt{n}, \sqrt{n}a_n, a_n, \ldots, a_n)$, and denote $\hat v = D_n(\hat\phi(\tau) - \phi(\tau))$. Then, writing $x_t = (1, y_{t-1}, \Delta y_{t-1}, \ldots, \Delta y_{t-p})'$ and $u_{t\tau} = \Delta y_t - x_t'\phi(\tau) = u_t - Q_u(\tau)$, minimizing the right-hand side of (2.3) is equivalent to minimizing

$$\Omega_n(v) = \sum_{t=1}^{n} \left[\rho_\tau(u_{t\tau} - v'D_n^{-1}x_t) - \rho_\tau(u_{t\tau})\right].$$

If $\hat v$ minimizes $\Omega_n(v)$, then $\hat v = D_n(\hat\phi(\tau) - \phi(\tau))$, and the original objective in (2.3) is represented by the convex function $\Omega_n(v)$. As a result, we can use the convexity lemma in Knight (1989, 1991) and Pollard (1991) to establish the asymptotic properties. By the convexity lemma, if the finite-dimensional distributions of $\Omega_n(v)$ converge weakly to those of $\Omega(v)$, and $\Omega(v)$ has a unique minimum, then the convexity of $\Omega_n(v)$ implies that $\hat v$ converges in distribution to the minimizer of $\Omega(v)$.

By Knight's identity (Knight, 1989, 1998), $\Omega_n(v)$ can be divided into two parts as follows:

$$\Omega_n(v) = -\sum_{t=1}^{n} v'D_n^{-1}x_t\,\psi_\tau(u_{t\tau}) + \sum_{t=1}^{n} \int_0^{v'D_n^{-1}x_t} \left[I(u_{t\tau} \le s) - I(u_{t\tau} \le 0)\right] ds,$$

where $\psi_\tau(u) = \tau - I(u \le 0)$. To show the weak convergence of $\Omega_n(v)$, we need to show the convergence of each part. Once $\Omega_n(v)$ is shown to converge weakly to $\Omega(v)$, $\hat v = D_n(\hat\phi(\tau) - \phi(\tau))$ converges in distribution to the minimizer of $\Omega(v)$.
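Knight's identity, in its standard form $\rho_\tau(u - v) - \rho_\tau(u) = -v\,\psi_\tau(u) + \int_0^v [I(u \le s) - I(u \le 0)]\,ds$, can be checked numerically. The short sketch below does so on a small grid; the closed form used for the integral is worked out by hand for this illustration, and the function names are hypothetical.

```python
def rho(u, tau):
    """Check function rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def psi(u, tau):
    """psi_tau(u) = tau - I(u <= 0)."""
    return tau - (1.0 if u <= 0 else 0.0)


def knight_integral(u, v):
    """Closed form of the integral of I(u <= s) - I(u <= 0) over s from 0 to v."""
    if v >= 0:
        # Only u in (0, v] contributes, over the stretch s in [u, v].
        return max(v - u, 0.0) if u > 0 else 0.0
    # For v < 0 only u in (v, 0] contributes, over the stretch s in [v, u).
    return max(u - v, 0.0) if u <= 0 else 0.0


def knight_gap(u, v, tau):
    """Left-hand side minus right-hand side of Knight's identity."""
    lhs = rho(u - v, tau) - rho(u, tau)
    rhs = -v * psi(u, tau) + knight_integral(u, v)
    return lhs - rhs


# The gap should vanish up to floating-point rounding.
worst = max(
    abs(knight_gap(u, v, tau))
    for tau in (0.1, 0.5, 0.9)
    for u in (-2.0, -0.5, 0.0, 0.5, 2.0)
    for v in (-3.0, -1.0, 0.0, 1.0, 3.0)
)
```

The first term is linear in $v$ and carries the stochastic part of the limit, while the integral term is nonnegative and supplies the curvature, which is why the two parts are treated separately.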
To show the asymptotic properties, we impose the following assumptions.
Assumption 2. $u_t$ has a positive continuous density $f(u)$ on $\mathbb{R}$.

Corollary 1. Under the assumptions of Theorem 1, we have
.
(ii) for the unit-root parameter δ,

§3 Quantile Inference for Unit-root
In this section, we consider the unit root test with the null hypothesis H 0 : δ = 0, and the alternative hypothesis H 1 : δ < 0. To this end, we propose three statistics based on t-ratio, quantile Kolmogorov-Smirnov (QKS) and quantile Cramer-von Mises (QCM).

t-ratio Statistic
Inference based on quantile regression provides a more robust approach to testing the unit root hypothesis. Based on the asymptotic distribution of $\hat\delta(\tau)$, we can construct a t-ratio test statistic as follows:

$$t_n(\tau) = \frac{\hat f(F^{-1}(\tau))}{\sqrt{\tau(1-\tau)}}\,\left(Y_{-1}'P_X Y_{-1}\right)^{1/2}\,\hat\delta(\tau).$$

Here, $Y_{-1}$ is the vector of lagged dependent variables ($y_{t-1}$), and $P_X$ is the projection matrix onto the space orthogonal to $X_t = (1, \Delta y_{t-1}, \ldots, \Delta y_{t-p})'$. In particular,

Theorem 2.
Under the null hypothesis, and using the result for the asymptotic distribution of $\hat\delta$, we have where $B(t)$ is a standard Brownian motion.
We consider the following simulation procedure for generating the asymptotic critical values of $t_n(\tau)$, similar to Koenker and Xiao (2004) and Li and Park (2018):
1. Let $w_t = \Delta y_t$, $t = 2, \ldots, n$; fit $w_t$ by the $q$th order autoregression $w_t = \sum_{j=1}^{q} \beta_j w_{t-j} + u_t$, estimate $\beta_1, \ldots, \beta_q$ by quantile regression (QR), and obtain the residuals $\hat u_t$.
3. Generate $y_t^*$ under the null restriction of a unit root.
4. Generate a random normal vector $e_t$ for $t = 1, \ldots, n$.
6. Repeat steps 2 to 5 many times.
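The legible steps of the procedure can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' exact algorithm: steps 2 and 5 are not recoverable from the text, so the sketch assumes that pseudo-innovations are drawn with replacement from the fitted residuals; the autoregressive coefficients `beta` are passed in rather than estimated by quantile regression; and the normal vector $e_t$ of step 4 is not used because its role is not stated. All function names are hypothetical.

```python
import random


def ar_residuals(w, beta):
    """Residuals w_t - sum_j beta_j * w_{t-j} of a fitted AR(q) for w."""
    q = len(beta)
    return [
        w[t] - sum(beta[j] * w[t - 1 - j] for j in range(q))
        for t in range(q, len(w))
    ]


def resample_unit_root_path(y, beta, rng):
    """Generate one pseudo-series y* that has a unit root by construction."""
    # Step 1: differences of the observed series and AR(q) residuals.
    w = [y[t] - y[t - 1] for t in range(1, len(y))]
    q = len(beta)
    resid = ar_residuals(w, beta)

    # ASSUMPTION (steps 2 and 5 are missing in the text): draw shocks with
    # replacement from the residuals and rebuild the differences recursively.
    w_star = list(w[:q])
    for _ in range(len(w) - q):
        shock = rng.choice(resid)
        w_star.append(sum(beta[j] * w_star[-1 - j] for j in range(q)) + shock)

    # Step 3: impose the unit root, y*_t = y*_{t-1} + w*_t.
    y_star = [y[0]]
    for step in w_star:
        y_star.append(y_star[-1] + step)
    return y_star


observed = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
pseudo = resample_unit_root_path(observed, [0.5], random.Random(0))
```

Repeating the call many times and evaluating the test statistic on each pseudo-series would give an empirical distribution from which critical values can be read off.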
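Let $CV(\tau, \theta)$ be the $100\theta\%$ quantile of the simulated statistics (i.e., $P(t(\tau) \le CV(\tau, \theta)) = \theta$). Then, the unit root hypothesis is rejected at the $100\theta\%$ level if $t_n(\tau) < CV(\tau, \theta)$. To obtain the asymptotic critical values, we let the sample size, $n$, be 1000. The Monte Carlo simulation is repeated 10,000 times.

In order to obtain the feasible t-ratio test, we need a consistent estimator of $f(F^{-1}(\tau))$. We use

$$\hat f(F^{-1}(\tau)) = \frac{2h_n}{F_n^{-1}(\tau + h_n) - F_n^{-1}(\tau - h_n)},$$

where $h_n$ is the bandwidth and $F_n^{-1}(s)$ is the estimator of $F^{-1}(s)$. In order to select the bandwidth, Koenker and Xiao (2006) suggest two choices. One is the bandwidth rule suggested by Hall and Sheather (1988):

$$h_{HS} = n^{-1/3} z_\gamma^{2/3} \left[\frac{1.5\,\phi^2(\Phi^{-1}(\tau))}{2(\Phi^{-1}(\tau))^2 + 1}\right]^{1/3},$$

where $z_\gamma$ satisfies $\Phi(z_\gamma) = 1 - \gamma/2$ for a significance level $\gamma$. The other is Bofinger's (1975) rule:

$$h_B = n^{-1/5} \left[\frac{4.5\,\phi^4(\Phi^{-1}(\tau))}{\left(2(\Phi^{-1}(\tau))^2 + 1\right)^2}\right]^{1/5},$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the density and cumulative distribution function of the standard normal distribution, respectively.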
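The two bandwidth rules are simple to evaluate. The sketch below uses the forms in which they are commonly stated (for example in Koenker and Xiao, 2004), namely $n^{-1/3} z^{2/3}[1.5\,\phi^2(x_0)/(2x_0^2+1)]^{1/3}$ for Hall-Sheather and $n^{-1/5}[4.5\,\phi^4(x_0)/(2x_0^2+1)^2]^{1/5}$ for Bofinger, with $x_0 = \Phi^{-1}(\tau)$. The function names and the `level` argument (the significance level entering the normal quantile $z$) are hypothetical choices for this illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def hall_sheather_bandwidth(n, tau, level=0.05):
    """Hall-Sheather (1988) rule, of order n^(-1/3)."""
    x0 = STD_NORMAL.inv_cdf(tau)
    f0 = STD_NORMAL.pdf(x0)
    z = STD_NORMAL.inv_cdf(1.0 - level / 2.0)
    core = 1.5 * f0 ** 2 / (2.0 * x0 ** 2 + 1.0)
    return n ** (-1.0 / 3.0) * z ** (2.0 / 3.0) * core ** (1.0 / 3.0)


def bofinger_bandwidth(n, tau):
    """Bofinger (1975) rule, of order n^(-1/5)."""
    x0 = STD_NORMAL.inv_cdf(tau)
    f0 = STD_NORMAL.pdf(x0)
    core = 4.5 * f0 ** 4 / (2.0 * x0 ** 2 + 1.0) ** 2
    return n ** (-0.2) * core ** 0.2


h_hs = hall_sheather_bandwidth(1000, 0.5)
h_b = bofinger_bandwidth(1000, 0.5)
```

Because the Bofinger rule shrinks at rate $n^{-1/5}$ rather than $n^{-1/3}$, it gives the wider window in large samples; which rule works better with heavy-tailed residuals is an empirical question that this sketch does not address.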

QKS and QCM
In addition to the t-ratio statistic $t_n(\tau)$, which uses only a given quantile $\tau$, we may also use a coefficient-based statistic in the QAR model, just as with the ADF coefficient test $Z_\delta$. Define the coefficient-based statistic $U_n(\tau) = \sqrt{n}\,a_n\,\hat\delta(\tau)$.
By Corollary 1, under the unit-root hypothesis and Assumptions 1-3,

Let $\tau \in \mathcal{T} = [\tau_0, 1 - \tau_0]$, for small $\tau_0 > 0$. We define the QKS and QCM tests similarly to Koenker and Xiao (2004) and Galvao (2009) as follows:

$$QKS = \sup_{\tau \in \mathcal{T}} |U_n(\tau)|, \qquad QCM = \int_{\tau \in \mathcal{T}} U_n(\tau)^2\, d\tau.$$

Theorem 3. Suppose that $\delta = 0$ and Assumptions 1-3 hold, then

In practice, we may calculate $U_n(\tau)$ at $\{\tau_i = i/n\}_{i=1}^{n}$; the QKS statistic is then constructed by taking the maximum over $\tau_i \in \mathcal{T}$, and the QCM statistic is obtained by numerical integration.
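The grid-based computation described above can be sketched in a few lines. The sketch assumes the Koenker-Xiao style definitions, that is, QKS as the largest absolute value of $U_n(\tau)$ on the grid and QCM as the integral of $U_n(\tau)^2$ over $\mathcal{T}$; the trapezoidal rule is one possible choice of numerical integration, and the function name `qks_qcm` is hypothetical.

```python
def qks_qcm(taus, u_values):
    """Compute (QKS, QCM) from U_n evaluated on an increasing grid in T.

    taus     : increasing quantile levels inside [tau_0, 1 - tau_0]
    u_values : U_n(tau) at each grid point, in the same order
    """
    if len(taus) != len(u_values) or len(taus) < 2:
        raise ValueError("need matching grids with at least two points")
    # Kolmogorov-Smirnov type: maximum absolute value over the grid.
    qks = max(abs(u) for u in u_values)
    # Cramer-von Mises type: trapezoidal rule applied to U_n(tau)^2.
    qcm = sum(
        0.5 * (u_values[i] ** 2 + u_values[i + 1] ** 2) * (taus[i + 1] - taus[i])
        for i in range(len(taus) - 1)
    )
    return qks, qcm


grid = [0.1 + 0.1 * i for i in range(9)]       # tau from 0.1 to 0.9
qks, qcm = qks_qcm(grid, [2.0] * len(grid))    # constant U_n for illustration
```

The supremum reacts to a single extreme grid point, whereas the integral averages over the whole interval, so the two statistics can behave quite differently near the ends of $\mathcal{T}$.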
The critical values of the QKS and QCM tests can be obtained by the resampling method used above to calculate the critical values of $t_n(\tau)$.

§4 Finite sample evaluation
To study the finite sample properties of the proposed tests, we consider the following design, which is the leading case studied in the literature: 7) the t-ratio test $t_n(\tau)$ based on QAR at $\tau = 0.5$. The empirical size and the empirical power are given in Tables 1-4.

Table 1 reports the size performance of the tests. First, the ADF and PP tests show severe size distortion for all sample sizes and all distributions. Second, the VRT and $MZ_\phi$ tests show good size performance, similar to that of the $t_n(0.5)$ test. In addition, when the sample size is $n = 50$, the QKS test has severe size distortion, whereas it has good size performance when $n = 200$. Finally, the size performance of the QKS test is better than that of the QCM test. Moreover, as the sample size increases, the size performances of the QKS and QCM tests improve.

Tables 2-4 give the empirical power of the tests. First, the ADF and PP tests have low power for different $\phi$. Second, the power performance of the VRT test is poor but improves as the sample size increases. In addition, the $MZ_\phi$, QCM and $t_n(0.5)$ tests show good performance. However, the power of the QKS test is poor. A possible reason is that when $\tau$ tends to 0 or 1, the sample used to calculate $U_n(\tau)$ could be quite small, which may lead to large errors in calculating the critical values of QKS.

Table 1. Size of the tests ($\phi = 1$).

Figure 1(b) shows that QKS has good size performance but performs poorly in power. Figure 2 shows that QCM has severe size distortion but performs well in power.

§5 Real example

In this section, we study the unit root properties of the monthly Consumer Price Index (CPI) for several countries. The series vary in length from 241 to 723 observations and cover various time periods; see Figure 3. The distributions of CPI (defined as the first difference of the logarithm of prices) are often skewed and leptokurtic, differing markedly in shape from the normal. Therefore, an alternative assumption about the behaviour of the innovations is that they are draws from a symmetric stable Paretian distribution.

Table 5 gives the test results for a unit root using $t_n(0.5)$, QKS and QCM. Columns 3, 5 and 7 are the values of the test statistics $t_n(0.5)$, QKS and QCM, and columns 2, 4 and 6 are the critical values at the 5% significance level, except for the row corresponding to Australia, which gives the critical values at the 10% significance level. These critical values are calculated under the unit root null using the resampling procedure described in Section 3.1. Table 5 shows that the CPI series for Japan, China, UK, Euro, Mexico, Brazil, Korea and U.S. are unit root processes according to the different test statistics. For the CPI of Australia, the unit root hypothesis is marginally rejected by the KS-type and CM-type tests at the 10% level, but not rejected by $t_n(0.5)$, whereas Figure 3 suggests that the unit root hypothesis for the CPI of Australia should not be rejected. This also indicates that the $t_n$ test performs better in testing for a unit root.

Before the proof of Theorem 1, we first give a weak convergence result that will be needed.

Lemma 1. Under the assumptions of Theorem 1, we have
where $\eta_{\alpha/2}$ is a stable variable with tail index $\alpha/2$. Further, $\Big(\frac{1}{a_n}\sum_{t=1}^{n} \Delta y_{t-1}\psi_\tau(u_{t\tau}), \ldots, \frac{1}{a_n}\sum_{t=1}^{n} \Delta y_{t-p}\psi_\tau(u_{t\tau}), \frac{y_s}{a_n}, \frac{1}{\sqrt{n}} \cdots$.

We will only give the proof of (i) in the following. Note that $\psi_\tau(u_{t\tau})$ is independent of $\Delta y_{t-i}$ ($i = 1, \ldots, p$), and $nP(|\Delta y_t| > a_n) \sim 1$; then

$$nP(|\Delta y_{t-i}\psi_\tau(u_{t\tau})| > a_n) \sim E|\psi_\tau(u_{t\tau})|^{\alpha}. \qquad (6.1)$$

And following Resnick (1986), (6.2) holds, where $N$ is a Poisson random process. When $0 < \alpha < 1$, by Theorem 3.1 of Davis and Hsing (1995), (6.1) and (6.2), we obtain $\frac{1}{a_n}\sum_{t=1}^{n} \cdots$

Next, we consider the case $\alpha \ge 1$. Denote $S_n = 1$ and Also, where $S^{(M)}$ is a stable process with index $\alpha$. Together with (6.8), it holds that $\frac{1}{a_n}\sum_{t=1}^{n} \cdots$

The proof of Theorem 1. By Lemma 1, it follows that We now consider the limit of . First, we consider the limit of To avoid technical problems in taking conditional expectations, following Knight (1989), we consider truncation of $v_2 a_n^{-1} y_{t-1}$ at some finite number $m < 0$ and define Noting that $\bar U$ Under Assumption 2, It follows that $\bar U$ In addition, $(v'D_n^{-1}x_t)\,I(m < v_2 a_n^{-1} y_{t-1} < 0)$ Therefore, the limiting distribution of $\sum_{t=1}^{n} \xi_{tm}(v)$ is the same as that of $\sum_{t=1}^{n} \bar\xi_{tm}(v)$, that is, For any small number $\epsilon > 0$, Therefore, as $m \to -\infty$, Similarly, we can show As a result,

The proof of Corollary 1. By Theorem 1 and (2.4), it holds that By the convexity lemma of Pollard (1991) and the arguments of Knight (1989), note that $\Omega_n(v)$ and $\Omega(v)$ are minimized at $\hat v = D_n(\hat\phi(\tau) - \phi(\tau))$ and Furthermore, which also implies (2.5).
Theorems 2-3 follow easily from Theorem 1, Lemma 1 and the continuous mapping theorem. We omit the details here.