6.1 Introduction
The inferential procedures discussed in previous chapters rest on asymptotic arguments: they rely on the distribution of the test statistic converging to a known limit distribution as the sample size goes to infinity. A first-order asymptotic approximation works well, however, only if the limit distribution is an accurate approximation to the finite sample distribution, and for cointegrated VAR models this is generally not the case. In this chapter we investigate the performance of various small sample inference procedures for cointegrated vector autoregressive models. Special attention is given to the Bartlett (1937) adjustment and the bootstrap Bartlett adjustment for the likelihood ratio test. The bootstrap p-value test and an F-type test are also considered. Throughout the chapter, performance is assessed in terms of the empirical size and power of the inference procedures under consideration. An empirical application illustrates the use of these procedures with real data. The analysis should provide some guidance to practitioners unsure about which inference procedure to use when dealing with cointegrated VAR models.
6.2 Testing for Cointegrating Rank in Finite Samples
Consider the n-dimensional VAR model
$$\displaystyle{ \Delta X_{t} =\alpha \left (\beta ^{{\prime}}X_{t-1} +\rho ^{{\prime}}D_{t}\right ) +\sum \limits _{i=1}^{k-1}\Gamma _{i}\Delta X_{t-i} +\phi d_{t} +\varepsilon _{t},\qquad t = 1,\ldots,T, \tag{6.1} }$$
where \(X_{t}\) and \(\varepsilon _{t} \sim \left (0,\Omega \right )\) are (n × 1) vectors with \(E\left (\varepsilon _{t}\varepsilon _{s}^{{\prime}}\right ) = 0\) (for t ≠ s) and \(\Delta X_{t} = X_{t} - X_{t-1}\). The matrices of coefficients have the following dimensions: α and β are \(\left (n \times r\right )\); ϕ is \(\left (n \times n_{d}\right )\); ρ is \(\left (n_{D} \times r\right )\); and \(\Gamma _{1},\ldots,\Gamma _{k-1}\) are \(\left (n \times n\right )\). Also, \(d_{t}\) \(\left (n_{d} \times 1\right )\) and \(D_{t}\) \(\left (n_{D} \times 1\right )\) are the deterministic terms in (6.1).
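As seen in Chap. 4, the trace statistic for testing the null hypothesis that the cointegrating rank is r has the form
$$\displaystyle{ Q = -T\sum \limits _{i=r+1}^{n}\log \left (1 -\hat{\lambda }_{i}\right ), }$$
where \(1>\hat{\lambda }_{1}> \cdots>\hat{\lambda }_{n}> 0\), \(\hat{\lambda }_{n+1} = 0\) solve the eigenvalue problem in (4.77).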
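As a small illustration, the trace statistic can be computed directly from the estimated eigenvalues. The sketch below assumes the usual form \(Q = -T\sum _{i=r+1}^{n}\log (1 -\hat{\lambda }_{i})\); the function name `trace_statistic` and the eigenvalues in the example are hypothetical and are not taken from the chapter.

```python
import numpy as np

def trace_statistic(eigenvalues, r, T):
    """Trace statistic Q = -T * sum_{i>r} log(1 - lambda_i) for the null of rank r."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # descending order
    if np.any(lam < 0) or np.any(lam >= 1):
        raise ValueError("eigenvalues must lie in [0, 1)")
    # log1p(-x) = log(1 - x), numerically stable for small eigenvalues
    return -T * np.sum(np.log1p(-lam[r:]))

# Hypothetical eigenvalues for a three-variable system with T = 100 observations
lam_hat = [0.30, 0.10, 0.02]
for rank in range(3):
    print(rank, round(trace_statistic(lam_hat, rank, T=100), 3))
```

The statistic for rank r only uses the n − r smallest eigenvalues, so it decreases as the hypothesised rank increases.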
The trace statistic converges in distribution to
$$\displaystyle{ \mathrm{tr}\left \{\int _{0}^{1}\left (dW\right )F^{{\prime}}\left [\int _{0}^{1}FF^{{\prime}}du\right ]^{-1}\int _{0}^{1}F\left (dW\right )^{{\prime}}\right \}, }$$
where W is an \(\left (n - r\right )\)-dimensional Brownian motion on the unit interval, and F is a demeaned and detrended Brownian motion (see Appendix E.4). However, many studies have reported that the finite sample properties of the trace test differ from its asymptotic properties; see for example Cheung and Lai (1993) or Haug (1996) among many others. Broadly speaking, the problem can be described as a lack of coherence between the test statistic and its reference distribution. One way of addressing this problem is to correct the test statistic so that its finite sample distribution is closer to the asymptotic distribution. Early attempts at correcting the test statistic were made in Ahn and Reinsel (1990) and Reimers (1992), where small sample corrections based on degrees of freedom were suggested. More recently, Johansen (2002) proposed a Bartlett-type correction factor for Q.
The Bartlett correction is based on a simple idea, but it can be very effective in reducing the finite sample size distortion of likelihood ratio (LR) tests. Roughly speaking, the method takes the form of a correction to the mean of the LR statistic for a given parameter point θ. In regular cases the LR statistic, defined as minus twice the log likelihood ratio, is asymptotically distributed as \(\chi ^{2}\left (q\right )\), where q is the number of constraints, so the mean of the LR statistic ought to be approximately equal to q. The Bartlett correction is intended to bring the mean closer to q by replacing the statistic with \(\mathrm{LR}_{B} = \frac{q\left (\mathrm{LR}\right )} {E_{\theta }\left (\mathrm{LR}\right )}\) and then referring the resulting statistic to a \(\chi ^{2}\left (q\right )\) distribution. Given the complicated form of the LR test, it is typically difficult to find an exact expression for \(E_{\theta }\left (\mathrm{LR}\right )\); one can instead find an approximation of the form
$$\displaystyle{ E_{\theta }\left (\mathrm{LR}\right ) = q\left (1 + \frac{b\left (\theta \right )} {T} \right ) + O\left (T^{-3/2}\right ). }$$
Thus, the quantity
$$\displaystyle{ \mathrm{LR}_{B} = \frac{\mathrm{LR}} {1 + b\left (\theta \right )/T} }$$
has expectation \(q + O\left (T^{-3/2}\right )\), which is closer to the mean of the limit distribution; see for example Cribari-Neto and Cordeiro (1996) for an excellent survey on this type of correction.
As an alternative, for a given test statistic one could correct the reference distribution. This second route involves replacing the critical values of the limit distribution with transformations of critical values obtained from the Edgeworth expansions of the distribution function. In the literature it has been shown that in many cases the bootstrap can be considered as a numerical approximation to analytical calculations of one-term Edgeworth expansions; see Hall (1992). For cointegrated VAR models, due to the intricate analysis, the ability of the bootstrap test to provide higher order refinement is still an open question. An important breakthrough in this literature is given in Park (2003) where asymptotic expansions for the unit root models are developed and it is shown that the bootstrap test provides asymptotic refinements for the Dickey-Fuller tests. Considering that the trace test is just the multivariate extension of the (augmented) Dickey-Fuller tests, Park’s results are quite promising. In practice, however, whether a particular asymptotically valid technique is useful in small samples can only be evaluated by simulation. Thanks to the increase in the power of computers, the number of studies evaluating the usefulness and the limitations of bootstrap inference in cointegrating systems is growing rapidly; see for example van Giersbergen (1996), Harris and Judge (1998), and Mantalos and Shukur (2001).
In the next two subsections we consider Johansen's (2002) Bartlett-type correction factor for the trace test and the bootstrap p-value test.
6.2.1 Bartlett Correction Factor for the Trace Test
To review Johansen's Bartlett adjustment it is convenient to rearrange the stochastic vector \(\left (X^{{\prime}}\beta,\Delta X_{t-1}^{{\prime}},\ldots,\Delta X_{t-p+1}\right )\) as an AR(1) process
with
or more concisely
Let θ denote the parameters of the model (6.1) under the assumption that \(\Pi =\alpha \beta ^{{\prime}}\) and \(\Upsilon =\alpha \rho ^{{\prime}}\) and \(\Sigma\) the variance of Y. Calculating the Bartlett correction factor requires calculating an asymptotic expansion for the expected value of the test statistic. Johansen derived an approximation to \(E_{\theta }[-2\log \mathrm{LR}(\Pi =\alpha \beta ^{{\prime}},\) \(\Upsilon =\alpha \rho ^{{\prime}})]\) in the form
where \(n_{b} = n - r\), and suggested using the corrected statistic
where \(f\left (n_{b},n_{d}\right ) =\mathop{\lim }\limits_{ T \rightarrow \infty }f\left (T,n_{b},n_{d}\right )\) and \(a(T,n_{b},n_{d}) = f\left (T,n_{b},n_{d}\right )/\) \(f\left (n_{b},\text{ }n_{d}\right ).\) The correction factor for the trace test in (6.1) is given by
An analytic expression for \(b\left (\theta \right )\) is of the form
where
with
To implement the correction factor (6.2), the coefficients \(f\left (T,n_{b},n_{d}\right )\), \(a\left (T,n_{b},n_{d}\right )\) and \(g\left (T,n_{b},n_{d}\right )\) have to be calculated. These are complicated functionals of a random walk, so Johansen tabulates them by simulation; the interested reader is referred to Johansen (2002) for more details.
For practical purposes, it is important to note that the adjustment in (6.2) is derived under the assumption of Gaussian innovations. When the innovations are non-Gaussian, the second terms of the asymptotic expansions of the mean and the variance of Q depend on the skewness and kurtosis of their distribution. To use the analytical Bartlett correction factor it is therefore necessary to estimate the skewness and kurtosis of the true distribution and modify the Bartlett adjustment accordingly. In the case of cointegrated VAR models this can be rather demanding. In this respect, the non-parametric bootstrap method does not require a choice of error distribution and may be more convenient to use when the innovations are not normally distributed.
6.2.2 The Bootstrap p-Value Test
The bootstrap method can be used to approximate the finite sample distribution of the trace test under the null hypothesis. The basic principle is to use the bootstrap values of the LR statistic to approximate the p-value of the observed value of the test statistic. In short, this is done by generating a large number of repeated samples (resamples), calculating the bootstrap analogue of the statistic of interest in each resample, and using these values to compute the bootstrap p-value. The bootstrap p-value is then compared with the desired null rejection probability. Under the null hypothesis that the cointegrating rank is r, the steps used to implement the non-parametric bootstrap can be summarized as follows:
- Step (1): Estimate the model in (6.1) under the null hypothesis and generate bootstrap residuals \(\varepsilon _{t}^{{\ast}}\), t = 1, …, T, as independent draws with replacement from the centred residuals
$$\displaystyle{ \left \{\hat{\varepsilon }_{t} - T^{-1}\sum \limits _{ i=1}^{T}\hat{\varepsilon }_{ i}\right \}_{t=1}^{T} }$$
- Step (2): Construct the bootstrap sample recursively from
$$\displaystyle{ \Delta X_{t}^{{\ast}} =\hat{\alpha }\left (\hat{\beta }^{{\prime}}X_{t-1}^{{\ast}} +\hat{\rho }^{{\prime}}D_{t}\right ) +\sum \limits _{ i=1}^{k-1}\hat{\Gamma }_{ i}\Delta X_{t-i}^{{\ast}} +\hat{\phi } d_{ t} +\varepsilon _{ t}^{{\ast}} }$$
- Step (3): Compute the trace statistic \(Q_{j}^{{\ast}}\) using the data of step \(\left (2\right )\), and repeat steps (1)–(3) B times to obtain \(Q_{1}^{{\ast}},\ldots,Q_{B}^{{\ast}}\).
- Step (4): The bootstrap p-value of the observed value Q is estimated by
$$\displaystyle{ \hat{P}^{{\ast}}(Q) = B^{-1}\sum \limits _{ j=1}^{B}I\left (Q_{ j}^{{\ast}}\geq Q\right ) }$$
where I(⋅ ) is the indicator function that equals one if the inequality is satisfied and zero otherwise. The bootstrap p-value test is carried out by comparing \(\hat{P}^{{\ast}}(Q)\) with the desired significance level and rejecting the null hypothesis if \(\hat{P}^{{\ast}}(Q)\) is smaller than that level.
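The four steps can be sketched in a few lines of code. The example below is only a minimal illustration under simplifying assumptions that are not part of the chapter: one lag (k = 1), no deterministic terms, and a hand-written reduced rank regression (`johansen_k1`) instead of a tested library routine. All function names and the simulated data are hypothetical.

```python
import numpy as np

def johansen_k1(X, r):
    """Reduced rank regression for Delta X_t = alpha beta' X_{t-1} + eps_t
    (assumed: k = 1, no deterministic terms). Returns the trace statistic
    for rank r together with the rank-r estimates of alpha and beta."""
    dX, lagX = np.diff(X, axis=0), X[:-1]
    T = dX.shape[0]
    S00, S11, S01 = dX.T @ dX / T, lagX.T @ lagX / T, dX.T @ lagX / T
    # Solve |lambda*S11 - S10 S00^{-1} S01| = 0 through a Cholesky transformation
    Linv = np.linalg.inv(np.linalg.cholesky(S11))
    M = Linv @ S01.T @ np.linalg.solve(S00, S01) @ Linv.T
    lam, V = np.linalg.eigh(M)
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    beta = Linv.T @ V[:, :r]              # normalised so that beta' S11 beta = I
    alpha = S01 @ beta
    Q = -T * np.sum(np.log1p(-lam[r:]))
    return Q, alpha, beta

def bootstrap_trace_pvalue(X, r, B=199, seed=0):
    """Non-parametric bootstrap p-value of the trace test for rank r."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    Q, alpha, beta = johansen_k1(X, r)
    Pi = alpha @ beta.T
    resid = np.diff(X, axis=0) - X[:-1] @ Pi.T
    resid = resid - resid.mean(axis=0)                 # Step (1): centred residuals
    T = resid.shape[0]
    Q_star = np.empty(B)
    for j in range(B):
        eps = resid[rng.integers(0, T, size=T)]        # draws with replacement
        Xs = np.empty_like(X)
        Xs[0] = X[0]                                   # condition on the initial value
        for t in range(1, T + 1):                      # Step (2): recursion
            Xs[t] = Xs[t - 1] + Pi @ Xs[t - 1] + eps[t - 1]
        Q_star[j] = johansen_k1(Xs, r)[0]              # Step (3)
    return Q, float(np.mean(Q_star >= Q))              # Step (4)

# Simulated bivariate system with one cointegrating vector (illustrative values)
rng = np.random.default_rng(1)
T, n = 60, 2
alpha0, beta0 = np.array([[-0.4], [0.0]]), np.array([[1.0], [-1.0]])
X = np.zeros((T + 1, n))
for t in range(1, T + 1):
    X[t] = X[t - 1] + alpha0 @ beta0.T @ X[t - 1] + rng.standard_normal(n)

Q, p = bootstrap_trace_pvalue(X, r=1, B=199)
print(f"Q = {Q:.2f}, bootstrap p-value = {p:.3f}")
```

The null of rank r would be rejected at the 5% level when the returned p-value falls below 0.05. With lags or deterministic terms, the partialling-out step and the recursion would need to be extended accordingly.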
Before concluding this section, a caveat regarding the use of the bootstrap algorithm above is in order. When the innovations are independent and identically distributed (IID) with common variance, accurate inference can be obtained by simply resampling the residuals of the estimated model in (6.1), without making a parametric assumption about the distribution of the innovations. Swensen (2006) considered a recursive bootstrap algorithm for testing the rank of \(\Pi =\alpha \beta ^{{\prime}}\) in (6.1) and proved that, under a variety of regularity conditions, the non-parametric bootstrap test is consistent in the sense that the bootstrap statistic converges weakly in probability to the correct asymptotic distribution. However, when the innovations are not IID, simply resampling from the residuals fails to mimic essential features of the data generating process (DGP). Several bootstrap procedures have been proposed in the literature to overcome this problem. For example, a suitable modification of the residual-based bootstrap for conditionally heteroscedastic innovations is the wild bootstrap, which is designed to accommodate independent but not identically distributed innovations; see Cavaliere et al. (2010b) and Cavaliere (2010a) for applications of the bootstrap where the innovations admit conditional heteroscedasticity. An alternative procedure able to deal with both conditional heteroscedasticity and serial correlation is the resampling of blocks of autoregressive residuals; see for example Berkowitz and Kilian (2000) for a comprehensive survey on bootstrapping time series with non-IID innovations.
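To make the difference from IID resampling concrete, the sketch below generates wild bootstrap residuals. It assumes Rademacher multipliers (each equal to +1 or −1 with probability one half), which is one common choice and not necessarily the one used in the papers cited above; the function name and the toy residuals are hypothetical.

```python
import numpy as np

def wild_bootstrap_residuals(resid, rng):
    """Multiply each residual vector by an independent Rademacher draw.

    Every bootstrap residual stays at its original date, so any pattern in
    the residual variance over time is carried over to the bootstrap sample.
    """
    resid = np.asarray(resid, dtype=float)
    w = rng.choice([-1.0, 1.0], size=resid.shape[0])   # one multiplier per date
    return resid * w[:, None]

# Toy residuals whose scale grows over time (illustrative only)
rng = np.random.default_rng(0)
scale = np.linspace(0.5, 2.0, 8)[:, None]
resid = scale * rng.standard_normal((8, 2))
resid_star = wild_bootstrap_residuals(resid, rng)
print(np.abs(resid_star) - np.abs(resid))   # all zeros: only the signs change
```

Because the same multiplier is applied to the whole residual vector at each date, the contemporaneous correlation across equations is also preserved.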
6.3 Testing Linear Restrictions on β
Once the cointegrating rank has been established, linear restrictions on the cointegrating space can be tested. In this chapter we focus on the hypothesis \(\mathcal{H}_{0}:\beta = H\varphi\), where \(H\left (n \times s\right )\) (for r ≤ s ≤ n) is a known matrix that imposes the same restrictions on all cointegrating vectors, s is the number of unrestricted parameters, and φ is an \(\left (s \times r\right )\) matrix; see Johansen (1995) for a discussion of tests for other hypotheses. An LR test statistic for \(\mathcal{H}_{0}\) can be obtained from the concentrated likelihood function and is given by
$$\displaystyle{ \Lambda = T\sum \limits _{i=1}^{r}\log \left \{\frac{1 -\hat{\lambda }_{i}} {1 -\tilde{\lambda }_{i}}\right \}, \tag{6.3} }$$
where \(\hat{\lambda }_{i}\) and \(\tilde{\lambda }_{i}\) are the usual eigenvalues implied by the maximum likelihood estimation of the restricted and unrestricted models, respectively. \(\Lambda\) is asymptotically distributed as \(\chi ^{2}\left (r\left (n - s\right )\right )\) under the null hypothesis.
As for the trace test, \(\Lambda\) has the correct size asymptotically. However, many studies report that the \(\chi ^{2}\) approximation to the finite sample distribution of the LR test can be seriously inaccurate; see for example Haug (2002) or Fachin (2000).
One of the early works addressing this issue was the article by Podivinsky (1992). After considering the analogy with the classical linear regression theory, Podivinsky proposed an F-type test. Let
and l be the number of parameters estimated subject to the maintained hypothesis \(\Pi =\alpha \beta ^{{\prime}}\), then
is taken to have an F distribution with \(\left (r\left (n - s\right ),T - l\right )\) degrees of freedom under the null hypothesis.
Alternatively, Johansen (2000) proposed a Bartlett adjustment for the LR test in (6.3) and analytically derived the asymptotic expansions needed to calculate the expectation of the test statistic.
For the null hypothesis under consideration the Bartlett adjustment is given by
where \(q = r\left (n - s\right )\), \(v\left (\alpha \right ) = \mathrm{tr}\left \{\left (\alpha ^{{\prime}}\Omega ^{-1}\alpha \right )^{-1}\Sigma _{\beta \beta }^{-1}\right \}\) with
and the constant \(c\left (\alpha \right )\) is given in Johansen (2000). Thus, \(\Lambda _{B} =\vartheta ^{-1}\Lambda\) is the Bartlett corrected LR statistic.
Multiplying the unadjusted statistic by a factor derived from an asymptotic expansion of the expectation of the test statistic brings the distribution of the adjusted statistic closer to the \(\chi ^{2}\) distribution, thus reducing the size distortion. However, simulation results in Johansen indicated that “the influence of the parameters is crucial …. There are parameters points close to the boundary where the order of integration or the number of cointegrating relations change, and where the correction does not work well” (cf. Johansen 2000, p. 741).
The dependency on the parameter values may be reduced by computing the Bartlett adjustment using the non-parametric bootstrap. Important theoretical and empirical results in the literature suggest that size distortion of \(\Lambda\) depends in a complicated way on the number of nuisance parameters as well as the parameter values; see Gredenhoff et al. (2001), Haug (2002) or Gonzalo (1994) among others. Because the bootstrap method involves replacing the unknown cumulative distribution function of the LR test by the empirical distribution function of the same test, the resulting inference procedure may show less sensitivity to the values of the parameters of the DGP than a test based on the asymptotic critical values. The bootstrap Bartlett method was first proposed in Rocke (1989) where hypothesis testing in seemingly unrelated regression models was considered. Rocke’s simulation results showed that the Bartlett adjustment for the LR test determined using the non-parametric bootstrap was considerably more accurate than the Bartlett adjustment from the second-order asymptotic method of Rothenberg (1984).
Calculating the bootstrap Bartlett corrected LR test in (6.3) involves undertaking a simulation study using the constrained estimates of θ, denoted by \(\hat{\theta }= \left (\hat{\alpha },\hat{\beta },\hat{\Gamma }_{i},\hat{\phi },\hat{\rho },\hat{\Omega }\right )\), conditional on the initial values \(X_{0}\) and \(\Delta X_{0}\), as the true values. Given these estimates and any required starting values, bootstrap data can be generated recursively after resampling residuals. From each generated sample, one obtains a bootstrap analogue of \(\Lambda\) whose average estimates the mean of the test statistic under the null hypothesis.
The steps used to implement the bootstrap algorithm for calculating the bootstrap Bartlett corrected LR test can be summarized as follows:
- Step (1): Estimate the model in (6.1) and compute \(\Lambda\) and the restricted residuals
$$\displaystyle{ \hat{\varepsilon }_{t} = \Delta X_{t} -\hat{\alpha }\left (\hat{\varphi }^{{\prime}}H^{{\prime}}X_{t-1} +\hat{\rho }^{{\prime}}D_{t}\right ) -\sum \limits _{i=1}^{k-1}\hat{\Gamma }_{ i}\Delta X_{t-i} -\hat{\phi } d_{t}. }$$
- Step (2): Resample the residuals from \(\left (\hat{\varepsilon }_{1},\ldots,\hat{\varepsilon }_{T}\right )\) independently with replacement to obtain a bootstrap sample \(\left (\varepsilon _{1}^{{\ast}},\ldots,\varepsilon _{T}^{{\ast}}\right )\). Generate the bootstrap sample
$$\displaystyle{ \Delta X_{t}^{{\ast}} =\hat{\alpha }\left (\hat{\varphi }^{{\prime}}H^{{\prime}}X_{t-1}^{{\ast}} +\hat{\rho }^{{\prime}}D_{t}\right ) +\sum \limits _{ i=1}^{k-1}\hat{\Gamma }_{ i}\Delta X_{t-i}^{{\ast}} +\hat{\phi } d_{ t} +\varepsilon _{ t}^{{\ast}}, }$$
recursively from \(\left (\varepsilon _{1}^{{\ast}},\ldots,\varepsilon _{T}^{{\ast}}\right )\) using the restricted estimates of the model in (6.1).
- Step (3): Compute \(\Lambda _{j}^{{\ast}}\) using the data of step (2), and repeat steps (2) and (3) B times.
The bootstrap draws can then be used in place of asymptotic critical values as follows:
- Step (4): Average the bootstrap values \(\Lambda _{1}^{{\ast}},\ldots,\Lambda _{B}^{{\ast}}\) to obtain an estimate, \(\overline{\Lambda }^{{\ast}}\), of the mean of \(\Lambda\) under the null hypothesis. A Bartlett-type corrected statistic is therefore \(\Lambda _{B}^{{\ast}} = \frac{q\Lambda } {\overline{\Lambda }^{{\ast}}}\). The corrected statistic is then referred to a \(\chi ^{2}\left (q\right )\) distribution (with \(q = r\left (n - s\right )\)).
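Step (4) amounts to a single rescaling once the bootstrap statistics are available. The sketch below assumes that the observed statistic and the B bootstrap statistics from steps (1)–(3) have already been computed; the function name and the numbers are hypothetical, and the bootstrap draws are simulated as inflated \(\chi ^{2}\) variates purely to mimic an oversized test.

```python
import numpy as np

def bootstrap_bartlett(lr_obs, lr_boot, q):
    """Bartlett-type correction q * Lambda / mean(Lambda*_1, ..., Lambda*_B)."""
    lr_boot = np.asarray(lr_boot, dtype=float)
    return q * lr_obs / lr_boot.mean()

# Illustrative numbers only: q = 1 restriction, B = 400 bootstrap statistics
# whose mean is inflated by about 30% relative to a chi-square(1) variate.
rng = np.random.default_rng(0)
q = 1
lr_boot = 1.3 * rng.chisquare(df=q, size=400)
lr_obs = 4.5

lr_corrected = bootstrap_bartlett(lr_obs, lr_boot, q)
crit_5pct = 3.841   # 95% quantile of the chi-square(1) distribution
print(f"uncorrected: {lr_obs:.2f}, corrected: {lr_corrected:.2f}, "
      f"reject at 5%: {lr_corrected > crit_5pct}")
```

When the bootstrap mean exceeds q, the correction shrinks the statistic, which is how the over-rejection of the uncorrected test is reduced.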
The asymptotic distribution of the bootstrap-based procedure introduced in steps (1)–(4) is considered in Canepa (2016), where it is shown that the bootstrap Bartlett corrected test is asymptotically consistent.
Remark.
There has been some discussion in the literature on the use of unrestricted rather than restricted residuals when implementing the bootstrap procedure. Using the unrestricted residuals implies estimating the unconstrained VAR model and generating the pseudo-data on the basis of the estimated unconstrained coefficients. Most bootstrap implementations of tests for linear restrictions in cointegrated VAR models have been based on resampling from the restricted residuals [see Gredenhoff et al. (2001) among others]. On the other hand, Fachin and Omtzigt (2006) argue that, if the null hypothesis is not true, resampling from the unrestricted residuals greatly improves the power of the test and should be preferred. The authors also suggest that the Bartlett corrected test should be based on the unrestricted estimates.
To check whether the power properties of \(\Lambda _{B}^{{\ast}}\) are affected by the choice of resampling procedure, a simulation experiment was undertaken comparing the size and power rejection frequencies of the bootstrap Bartlett test generated using the unrestricted and the restricted residuals. Simulation results (not reported) showed that there was some loss in power when the bootstrap was based on the restricted DGP. However, it was also found that Type I error control was superior when the bootstrap DGP was based on the restricted estimates. This suggests that, in practice, the choice between the restricted and the unrestricted residuals has to be based on the specific application.
6.3.1 A Monte Carlo Experiment
How good is the bootstrap Bartlett procedure? That is, how well does the Bartlett correction factor approximate the finite sample expectation of \(\Lambda\)? Also, how does it compare with the available analytical correction or the F-type test? Questions of this nature can best be settled by case and simulation studies. We now describe the Monte Carlo study that addresses these questions.
In order to assess the relative performance of the inference procedures under consideration we look at two separate DGPs. The first DGP, labelled DGP1, is a four-dimensional cointegrated VAR with one cointegrated vector given by
where \(\alpha _{11} = 1\), \(\beta _{11} = 0.4\), \(\beta _{21} = 1.7\), \(\beta _{31} = 0.1\) and \(\varepsilon _{t} \sim N(0,\Omega )\) with \(\Omega = I\). The null hypothesis,
is tested against the alternative \(\mathcal{H}_{1}:\beta\) unrestricted. The second DGP, labelled DGP2, is a four-dimensional VAR of rank 2 given by
where \(\alpha _{22} = 1\), \(\beta _{22} = 0.4\), \(\beta _{32} = 0.5\) and the other parameters are as in DGP1. In this case the null hypothesis of interest is
where I is an identity matrix. Once again, under the alternative hypothesis β is unrestricted.
The Monte Carlo experiment was based on N = 10,000 replications for \(\Lambda\), \(\Lambda _{B}\) and F. The empirical sizes for \(\Lambda _{B}^{{\ast}}\) were generated from N = 1000 Monte Carlo replications, each with B = 400 bootstrap replications. Note that in Johansen’s procedure the maximum likelihood estimator of β in (6.1) is calculated as the set of eigenvectors corresponding to the r largest eigenvalues of \(S_{0k}^{{\prime}}S_{00}^{-1}S_{0k}\) with respect to \(S_{kk}\), where \(S_{00}\), \(S_{kk}\) and \(S_{0k}\) are the moment matrices formed from the residuals of the regressions of \(\Delta X_{t}\) and \(X_{t-k}\) on the \(\Delta X_{t-j}\). To increase numerical stability, the empirical levels of the test statistics reported below were obtained using an algorithm based on the QR decomposition; see Doornik and O’Brien (2002).
6.3.1.1 Some Simulation Results
The performance of the inference procedures under consideration has been assessed (1) by increasing the number of nuisance parameters in the DGP, and (2) by varying the values of some key parameters of the DGP. Tables 6.1 and 6.2 report the simulation results for the empirical sizes of \(\Lambda\), \(\Lambda _{B}\) and \(\Lambda _{B}^{{\ast}}\). The rejection frequencies have been estimated for a nominal level of 5% and all estimates are given as percentages.
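When reading empirical sizes it helps to know how much of the deviation from 5% can be attributed to simulation noise alone. The sketch below computes an approximate 95% band around the nominal level using the normal approximation to the binomial distribution; this is a standard device and an assumption on our part, since the chapter does not state how its confidence interval is constructed. The function name is hypothetical.

```python
import math

def size_band(nominal, n_rep, z=1.96):
    """Approximate 95% band for an empirical rejection frequency when the
    true size equals `nominal` and `n_rep` independent replications are used."""
    se = math.sqrt(nominal * (1.0 - nominal) / n_rep)
    return nominal - z * se, nominal + z * se

for n_rep in (10_000, 1_000):
    lo, hi = size_band(0.05, n_rep)
    print(f"N = {n_rep}: {100 * lo:.2f}% to {100 * hi:.2f}%")
```

With N = 10,000 replications the band is roughly 4.6% to 5.4%, while with N = 1000 it widens to roughly 3.6% to 6.4%, so the bootstrap results are subject to more simulation noise than the others.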
The impact of the nuisance parameters has been assessed by considering the effect of increasing the number of lags in the DGP. That is, starting with DGP1 with k = 1, the number of lags has been progressively increased up to 3. The same experiment has then been repeated with DGP2.
The first thing to note in Table 6.1 is that inference based on first-order asymptotic critical values is markedly inaccurate, with excessively high rejection rates. Increasing the number of lags, k, dramatically increases the deviation from the nominal level. By contrast, allowing more cointegrating vectors, r, in the system slightly reduces the size distortion. Turning to the empirical sizes for \(\Lambda _{B}\), \(\Lambda _{B}^{{\ast}}\) and F, we can see that they are much closer to the nominal size than those based on the first-order asymptotic critical values. Overall, the results in Table 6.1 show that correcting the LR test statistic is worthwhile, since all the empirical sizes reported for the corrected tests are closer to the nominal 5% level than those of the unadjusted test statistic. However, introducing many nuisance parameters in the model affects the size accuracy of all test statistics under consideration.
Turning to the second group of experiments, we now evaluate the sensitivity of the test statistics to variations of key parameters in the DGP. Our target is to detect regions of the parameter space where a large size distortion is more likely to occur. In order to maintain a high degree of control on the experimental design we restrict our attention to the DGP1 with k = 1. Note that, for a given sample size, the test performance in models with more nuisance parameters would be no better than that in a more parsimonious DGP.
In the simulation design, a problem we face is that DGP1 depends on a relatively large number of parameters. One way to address this issue is to change the coordinates of the system and rotate the VAR(1). This transformation leaves the statistical analysis of the model unchanged, but it does lead to a reduction in the number of parameters without loss of generality. Under \(\mathcal{H}_{0}:\beta = H\varphi\), the model in (6.5) can be rotated into
so that, for fixed T, the distribution of \(X_{t}\) depends on \(\beta _{11}\) and \(\beta _{21}\) only.
Table 6.2 shows the simulation results for several values of \(\beta _{11}\) and \(\beta _{21}\). The parameter values considered are \(\beta _{11} \in \left \{0,-0.2,-1.5,-1.9\right \}\) and \(\beta _{21} \in \left \{0.1,1.5,1.9\right \}\).
The top panel of Table 6.2 shows that when the cointegration relationship becomes weaker, \(\Lambda _{B}\) is largely oversized. On the other hand, when the DGP is close to an I(2) process (i.e. \(\beta _{11}\) is close to zero and \(\beta _{21}\neq 0\)), the test statistic under-rejects. For \(\beta _{11} <-0.5\) and \(\beta _{21}> 0\), \(\Lambda _{B}\) has empirical levels within the 95% confidence interval. The F-type test, while able to control for the dependence on the number of estimated parameters, does not take into account the magnitude of the parameters themselves, so its performance is mixed: it performs worst when \(\beta _{11} \rightarrow 0\) and \(\beta _{21} \rightarrow 0\), but it is well behaved in the region \(\beta _{11} <-0.5\) and \(\beta _{21}> 0\). Using the bootstrap to approximate the Bartlett adjustment factor produces estimated levels that are less sensitive to the values of the parameters. From Table 6.2 it appears that the region where the Bartlett correction is useful is approximately \(\beta _{11} <-0.5\) and \(\beta _{21}> 0\).
To wrap up the discussion, simulation results in Tables 6.1 and 6.2 show that in finite samples \(\Lambda\) is highly oversized. The error in rejection probability of the test statistic crucially depends on the parameter values of the DGP, and increasing the number of nuisance parameters worsens the performance of the test in finite samples. \(\Lambda _{B}\) offers improvements over the uncorrected statistic, but its behaviour mimics the performance of \(\Lambda\), thus it is not entirely reliable in regions where the properties of the stochastic process change. In contrast, \(\Lambda _{B}^{{\ast}}\) is less sensitive to the parameter values of the DGP.
6.3.1.2 The Probability of a Type II Error
It is well known that the Bartlett correction factor is designed to bring the actual size of asymptotic tests close to their respective nominal size, but it may lead to a loss in power. It is therefore of interest to evaluate the power properties of the inference procedures considered.
The power of the test statistics has been evaluated by generating the data under the following alternatives: \(\mathcal{H}_{1}:\beta _{41} = 0.15\) and \(\mathcal{H}_{1}:\beta _{41} = 0.3\) with r = 1, k = 1, 2, 3, and T = 50, 100, 150. Table 6.3 shows that the rejection frequencies increase with the sample size and with the distance between the null and the alternative. The rejection frequencies for the larger sample sizes, T = 100 say, are reasonable for both alternatives. Comparing the power rejection frequencies across procedures, Table 6.3 shows that correcting the test statistics for size reduces their power. However, there is evidence that \(\Lambda _{B}\), \(\Lambda _{B}^{{\ast}}\) and the F-type test share similar power properties, with no test uniformly outperforming the others. The results on sensitivity to the parameter values are not reported here, but they show that a slow adjustment to equilibrium worsens the rejection frequencies for all the finite sample inference procedures under consideration.
Before looking at an empirical application, a few points need to be made concerning the use of the bootstrap algorithm described above. The bootstrap Bartlett procedure, used here for testing \(\mathcal{H}_{0}:\beta = H\varphi\), is a general tool and can be readily applied to any test that allows the null hypothesis to be imposed in the bootstrap resampling procedure, such as tests of hypotheses on the adjustment coefficients α. This is not to say, however, that the method can be used in a mindless fashion. First, it should be noted that resampling and testing are done for a given cointegrating rank; hence it is assumed that all unit roots have been eliminated and the bootstrap is a stationary one. This requires that the roots of the characteristic equation of (6.1) are either equal to 1 or outside the unit circle. Therefore, before undertaking the resampling procedure it is necessary to check that the root condition is satisfied. This ensures that the pseudo-observations from the recursive scheme described above are in fact I(1) variables. Failing to check may result in the bootstrap sample becoming explosive.
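The root condition can be checked numerically before any resampling is done. The sketch below rewrites a VECM without deterministic terms as a VAR in levels and inspects the eigenvalues of its companion matrix; these are the reciprocals of the roots of the characteristic equation, so the condition becomes "exactly n − r eigenvalues equal to one and all others strictly inside the unit circle". The function names, the tolerance and the parameter values are illustrative assumptions.

```python
import numpy as np

def companion_roots(alpha, beta, gammas=()):
    """Companion eigenvalues of the level VAR implied by
    Delta X_t = alpha beta' X_{t-1} + sum_i Gamma_i Delta X_{t-i} + eps_t."""
    n = alpha.shape[0]
    k = len(gammas) + 1
    A = []                                   # level coefficients A_1, ..., A_k
    for i in range(1, k + 1):
        Ai = np.zeros((n, n))
        if i == 1:
            Ai += np.eye(n) + alpha @ beta.T
        if i <= k - 1:
            Ai += gammas[i - 1]              # + Gamma_i
        if i >= 2:
            Ai -= gammas[i - 2]              # - Gamma_{i-1}
        A.append(Ai)
    C = np.zeros((n * k, n * k))
    C[:n, :] = np.hstack(A)
    C[n:, :-n] = np.eye(n * (k - 1))
    return np.linalg.eigvals(C)

def root_condition_ok(alpha, beta, gammas=(), tol=1e-6):
    """True if there are exactly n - r unit eigenvalues and no explosive ones."""
    n, r = alpha.shape
    mod = np.abs(companion_roots(alpha, beta, gammas))
    n_unit = int(np.sum(np.abs(mod - 1.0) < tol))
    n_explosive = int(np.sum(mod > 1.0 + tol))
    return n_unit == n - r and n_explosive == 0

beta = np.array([[1.0], [-1.0]])
stable = root_condition_ok(np.array([[-0.4], [0.0]]), beta)
explosive = root_condition_ok(np.array([[0.4], [0.0]]), beta)
print(stable, explosive)
```

In the first case the adjustment coefficient pulls the system back towards equilibrium and the check passes; in the second the sign is reversed, one eigenvalue exceeds one in modulus, and a bootstrap recursion based on these estimates would explode.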
Further, as mentioned in Sect. 6.2, by using the empirical distribution function in place of some specific parametric distribution, the non-parametric bootstrap does not require a choice of error distribution. This feature is especially appealing to applied econometricians working with financial data. However, for such a method to be valid it is necessary that the DGP used for drawing bootstrap samples mimics the features of the underlying DGP. Therefore, the procedures described above need to be modified when the innovations are not independent or not identically distributed.
6.4 An Empirical Application
To illustrate how the small sample procedures described in this chapter work in practice we consider an application to interest rate theory. According to the uncovered interest parity theory (UIP) the difference between the nominal interest rates of two bonds in two different countries denominated in different currencies should be determined by the expected relative change in the associated exchange rate. That is,
where i is the log of the foreign long-term interest rate, i ∗ is the domestic nominal long-term interest rate, E is the log of the spot exchange rate (home currency price of a unit of foreign currency) and E e is the expected exchange rate. Thus, Eq. (6.6) implies that if the UIP mechanism is functioning one should observe the tendency of the two interest rates to adjust towards the long-run equilibrium level of exchange rates, meaning that \(\left (E -\left (i - i^{{\ast}}\right )\right )\) should be a stationary stochastic process.
As an application, the ten-year bond rates for Japan and the USA are considered. The dataset contains quarterly data from 1994:2 to 2006:4 for a nominal dollar exchange rate for the UK \(\left (E_{t}\right ),\) the domestic long-term interest rate \(\left (i_{t}\right ),\) and the US bond rate \(\left (i_{t}^{{\ast}}\right )\).
The first two columns in the top of Table 6.4 summarize the misspecification tests for the unrestricted VAR
estimated with an unrestricted constant, two lags and T = 50. The diagnostic tests are: \(F_{ar}\) for the hypothesis that there is no serial correlation against fourth-order autoregression, \(\chi _{no}^{2}\) for the hypothesis that the residuals are normally distributed, \(F_{arch}\) for the hypothesis that there is no autoregressive conditional heteroscedasticity (against fourth order) and \(\chi _{het}^{2}\) for the hypothesis that there is no heteroscedasticity. The misspecification tests show that \(F_{ar}\) does not reject the null hypothesis of no autocorrelation against fourth-order autoregression for either of the equations determining \(\left (i_{t} - i_{t}^{{\ast}}\right )\) and \(E_{t}\). Also, there is no evidence of heteroscedasticity, of the ARCH type or otherwise, given that the \(\chi _{het}^{2}\) and \(F_{arch}\) tests do not reject the null hypotheses for these equations. There is, however, some evidence of non-normality of \(E_{t}\).
On the basis of the rank tests reported in the middle panel, the hypothesis that there is one cointegrating vector can be accepted, since the trace test rejects the null hypothesis that r = 0 at the 5% level.
In the bottom panel of Table 6.4 the empirical sizes associated with the p-values for \(\Lambda,\Lambda _{B},\Lambda _{B}^{{\ast}}\) and F are reported. The null hypothesis under consideration is that \(\left (i_{t} - i_{t}^{{\ast}}\right ) - E_{t}\) is stationary, or equivalently that the vector \((1,-1)^{{\prime}}\in \mathrm{sp}(\beta )\). This can be formulated as the hypothesis \(\mathcal{H}_{0}:\beta = H\varphi\) with \(H^{{\prime}} = \left [\begin{array}{cc} 1& - 1 \end{array} \right ]\) against the alternative \(\mathcal{H}_{1}:\beta\) unrestricted. The empirical size for \(\Lambda _{B}^{{\ast}}\) has been calculated using the algorithm in Sect. 6.3 with B = 5000. The p-values for \(\Lambda\) have been calculated by taking the 95% quantile of the \(\chi ^{2}\left (1\right )\) distribution and calculating the actual p-value as the frequency of rejection. The test statistic has then been corrected by the Bartlett correction factor using (6.4) and the rejection frequency has been calculated again, thus providing the corrected p-values for \(\Lambda _{B}\).
As far as \(\Lambda\) is concerned, Table 6.4 suggests that the performance of the test is affected by the small sample size, since the null hypothesis is rejected. The F test also has a p-value below 0.05, although it performs slightly better than \(\Lambda\). On the other hand, neither \(\Lambda _{B}\) nor \(\Lambda _{B}^{{\ast}}\) rejects the UIP hypothesis.
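The arithmetic behind these comparisons can be sketched as follows. The sketch assumes that the bootstrap Bartlett adjustment rescales the observed statistic by the ratio of the degrees of freedom to the mean of the bootstrap statistics, in the spirit of Rocke (1989); this scaling, the function names and all numerical values are assumptions made for illustration and are not taken from Table 6.4. The \(\chi ^{2}\left (1\right )\) tail probability is computed from the identity \(P(\chi _{1}^{2} > x) = \mathrm{erfc}(\sqrt{x/2})\).

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def bootstrap_p_value(lr_obs, lr_boot):
    """Share of bootstrap statistics at least as large as the observed statistic."""
    return sum(1 for s in lr_boot if s >= lr_obs) / len(lr_boot)

def bootstrap_bartlett(lr_obs, lr_boot, df=1):
    """Rescale the observed statistic so that its bootstrap mean matches df."""
    return df * lr_obs / (sum(lr_boot) / len(lr_boot))

# Hypothetical values: an observed LR statistic and a handful of bootstrap replicates
# (in practice there would be B replicates, e.g. B = 5000).
lr_obs = 4.2
lr_boot = [0.8, 5.1, 1.9, 3.3, 0.4, 2.6, 6.0, 1.2]

p_asymptotic = chi2_sf_1df(lr_obs)                 # uncorrected LR test
lr_corrected = bootstrap_bartlett(lr_obs, lr_boot)
p_corrected = chi2_sf_1df(lr_corrected)            # corrected statistic against chi2(1)
p_bootstrap = bootstrap_p_value(lr_obs, lr_boot)   # bootstrap p-value test
```

When the bootstrap mean exceeds the degrees of freedom, the corrected statistic is smaller than the raw one, which is how the correction can turn a rejection by \(\Lambda\) into a non-rejection.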
6.5 Conclusion
In this chapter we have reviewed several inference procedures that can be used to reduce the finite sample size distortion problem when making inferences on cointegrated VAR models. Monte Carlo simulations show that for small to moderate sample sizes inference based on first-order asymptotic approximation is largely inaccurate, and that the size distortion of the asymptotic test worsens dramatically when more nuisance parameters are included in the DGP. In addition, the performance of the LR test is sensitive to the magnitude of the cointegrating coefficients. The Bartlett corrected test suggested in Johansen (2000) dramatically improves the behaviour of the LR procedure in several instances, but still leaves something to be desired for parameter points close to the boundary where the order of integration changes or the number of cointegrating relations changes. On the other hand, the bootstrap Bartlett corrected test appears to be less sensitive to the values of the parameters of the DGP than its analytical counterpart. Finally, for small sample sizes the results for the F-type test are mixed.
The Monte Carlo experiment is limited to the LR test for linear restrictions on the cointegrating space. However, the same considerations should apply to the trace test. In the next chapter the impact of volatility is accounted for in the model estimation.
Notes
1. Note that in Johansen (2002a) a Bartlett correction factor for \(\Lambda\) is derived under the assumption that the adjustment parameter α is known. Although theoretically interesting, the resulting test statistic is less relevant for applied work. Accordingly, in this chapter we restrict our attention to Johansen (2000) where this assumption is dropped.
2. See Canepa (2012) for extensive simulation results, including the case where innovations are heteroscedastic.
References
Ahn, S. K., & Reinsel, G. C. (1990). Estimation for partially nonstationary multivariate autoregressive models. Journal of the American Statistical Association, 85, 813–823.
Bartlett, M. S. (1937). Properties of sufficiency and statistical tests. Proceedings of the Royal Society, 160, 268–282.
Berkowitz, J., & Kilian, L. (2000). Recent developments in bootstrapping time series. Econometric Reviews, 19, 1–48.
Canepa, A. (2012). Robust Bartlett Adjustment for Hypotheses Testing on Cointegrating Vectors. Economics and Finance Working Paper Series, Working Paper No. 12-10.
Canepa, A. (2016). A note on Bartlett correction factor for tests on cointegrating relations. Statistics and Probability Letters, 110, 295–304.
Cavaliere, G., Rahbek, A., & Taylor, A. M. R. (2010a). Cointegration rank testing under conditional heteroskedasticity. Econometric Theory, 26, 1719–1760.
Cavaliere, G., Rahbek, A., & Taylor, A. M. R. (2010b). Testing for co-integration in vector autoregressions with non-stationary volatility. Journal of Econometrics, 158, 7–24.
Cheung, Y.-W., & Lai, K. S. (1993). Finite-sample sizes of Johansen’s likelihood ratio tests for cointegration. Oxford Bulletin of Economics and Statistics, 55, 313–328.
Cribari-Neto, F., & Cordeiro, G. (1996). On Bartlett and Bartlett-type corrections. Econometric Reviews, 15, 339–401.
Doornik, J. A., & O’Brien, R. J. (2002). Numerically stable cointegration analysis. Computational Statistics & Data Analysis, 41, 185–193.
Fachin, S. (2000). Bootstrap and asymptotic tests of long-run relationships in cointegrated systems. Oxford Bulletin of Economics and Statistics, 62, 543–551.
Fachin, S., & Omtzigt, P. (2006). The size and power of bootstrap and Bartlett-corrected tests of hypotheses on the cointegrating vectors. Econometric Reviews, 25, 41–60.
Gonzalo, J. (1994). Comparison of five alternative methods of estimating long-run equilibrium relationships. Journal of Econometrics, 60, 203–233.
Gredenhoff, M., & Jacobson, T. (2001). Bootstrap testing linear restrictions on cointegrating vectors. Journal of Business & Economic Statistics, 19, 63–72.
Hall, P. (1992). The bootstrap and Edgeworth expansion. New York: Springer.
Harris, R. I. D., & Judge, G. (1998). Small sample testing for cointegration using the bootstrap approach. Economics Letters, 58, 31–37.
Haug, A. A. (1996). Tests for cointegration: A Monte Carlo comparison. Journal of Econometrics, 71, 89–115.
Haug, A. A. (2002). Testing linear restrictions on cointegrating vectors: Sizes and powers of Wald and likelihood ratio tests in finite samples. Econometric Theory, 18, 505–524.
Johansen, S. (1995). Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford: Oxford University Press.
Johansen, S. (2000). A Bartlett correction factor for tests on the cointegrating relations. Econometric Theory, 16, 740–778.
Johansen, S. (2002). A small sample correction for the test of cointegrating rank in the vector autoregressive model. Econometrica, 70, 1929–1961.
Johansen, S. (2002a). A small sample correction for tests of hypotheses on the cointegrating vectors. Journal of Econometrics, 111, 195–221.
Mantalos, P., & Shukur, G. (2001). Bootstrapped Johansen tests for cointegration relationships: A graphical analysis. Journal of Statistical Computation and Simulation, 68, 351–371.
Park, J. Y. (2003). Bootstrap unit root tests. Econometrica, 71, 1845–1895.
Podivinsky, J. M. (1992). Small sample properties of tests of linear restrictions on cointegrating vectors and their weights. Economics Letters, 39, 13–18.
Reimers, H.-E. (1992). Comparisons of tests for multivariate cointegration. Statistical Papers, 33, 335–359.
Rocke, D. M. (1989). Bootstrap Bartlett adjustment in seemingly unrelated regression. Journal of the American Statistical Association, 84, 598–601.
Rothenberg, T. J. (1984). Hypothesis testing in linear models when the error covariance matrix is non-scalar. Econometrica, 52, 827–842.
Swensen, A. R. (2006). Bootstrap algorithms for testing and determining the cointegration rank in VAR models. Econometrica, 74, 1699–1714.
van Giersbergen, N. P. A. (1996). Bootstrapping the trace statistics in VAR models: Monte Carlo results and applications. Oxford Bulletin of Economics and Statistics, 58, 391–408.
© 2017 The Author(s)
Hunter, J., Burke, S.P., Canepa, A. (2017). Testing in VECMs with Small Samples. In: Multivariate Modelling of Non-Stationary Economic Time Series. Palgrave Texts in Econometrics. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-137-31303-4_6
Publisher Name: Palgrave Macmillan, London
Print ISBN: 978-0-230-24330-9
Online ISBN: 978-1-137-31303-4