Abstract
This article shows that statistical evidence of regime-switching admits more than one interpretation. The usual interpretation is that some parameters switch according to the values of a predefined latent variable. An alternative interpretation is that statistical evidence of regime-switching may also arise when the linear model is underspecified and an omitted-variable bias emerges. A formal test is proposed to detect a potentially spurious regression with regime-switching. This test shows that the regime-switching estimates presented in a published study should be interpreted as a consequence of the misspecification considered here.
Notes
Only regime-switching models with unobservable switching are considered.
The chapter is titled "Macroeconomics, Non-linear Time Series in."
The identification problem arises when specifying the null hypothesis as a linear model given the R-S estimates. In particular, the transition probabilities are nuisance parameters that are unidentified under the null, which makes the information matrix singular under the null. Thus, the standard inference tools cannot apply.
y, x and z are defined as stochastic variables such that the usual assumptions of the classical linear regression model apply. The model describes the causal relationship between the relevant variables.
The considered data-generating processes imply that the underlying parameters measure the causal effects between variables.
The usual definition of the plim operator applies. For references, see White (1991, chapter 2). Throughout the article, the plim value of a matrix is understood as the plim value of each of its elements.
\(s_n\) is drawn from a discrete distribution whose support is \(\{1,2,\ldots ,r\}\), independently of X.
\({{\varvec{E}}}[\,\cdot \,|{{\varvec{Y}}},{{\varvec{X}}},{\varvec{\rho }} _{\mathbf{{1}},{{\varvec{i}}}}]\) denotes the mathematical expectation operator conditional on the realizations of Y and X and on the relevant parameter vector \(\rho _{1,i} =[ \rho _{0,i} \quad \theta _i ]\), with \(\theta _i \equiv [ \beta _i \quad \sigma ]\), whose elements are defined in A8 and A9.
For concepts specific to the RSM, the reader may consider (Hamilton 1994, chapter 22).
This is due to exogeneity of \({{\varvec{X}}}\) assumed in A2 and in A9.
ID stands for identically distributed.
Note that this is a sufficient (but not necessary) condition for applying the Central Limit Theorem in the Appendix.
The R-S estimators exist provided that \(\left( {{{\varvec{X}}}_\mathbf{{1}}^{{\prime }} {\widehat{{{\varvec{P}}}}}_{{\varvec{i}}} {{\varvec{X}}}_\mathbf{{1}}} \right) ^{-1}\) exists, \(i=1,2,\ldots ,r\). Note that the result of Lemma 2 presented later is not necessary for this existence, implying that the econometric model need not be correctly specified.
The estimated smoothed probabilities of each regime and for each observation are organized in a diagonal matrix \(\widehat{{{\varvec{p}}}}_{{\varvec{i}}}\) consistently with A6 (where the main diagonal corresponds to the vector of the inferred smoothed probabilities conditional on regime i).
A similar analysis is provided in Timmermann (2000), but only for some R-S models.
\(\varvec{\rho }_{\mathbf{{1},{\varvec{i}}}}=[{\varvec{\rho }} _\mathbf{0 ,{{\varvec{i}}}} \quad {\varvec{\theta }}_{{\varvec{i}}}]\) with \({\varvec{\theta }}_{{\varvec{i}}} \equiv [{\varvec{\beta }}_{{\varvec{i}}} \quad \sigma ]\).
They are still organized in r matrices: \({{\varvec{S}}}_{{\varvec{i}}}\) is an \(N{\times }N\) diagonal matrix, such that the nth element of the main diagonal is 1 if the nth observation belongs to regime i \((i=1,2,\ldots ,r)\) and zero otherwise.
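As a minimal sketch of this construction (using an invented regime sequence, not data from the article), the \(S_i\) can be built and checked to partition the sample:

```python
import numpy as np

# Hypothetical regime sequence for N = 8 observations, r = 2 regimes (labels 1 and 2).
regimes = np.array([1, 1, 2, 1, 2, 2, 1, 2])
N = len(regimes)

# S_i is an N x N diagonal matrix whose n-th diagonal entry is 1
# iff observation n belongs to regime i, and 0 otherwise.
S = {i: np.diag((regimes == i).astype(float)) for i in (1, 2)}

# The S_i partition the sample: they sum to the identity matrix,
# and tr(S_i)/N is the sample share of regime i.
assert np.allclose(S[1] + S[2], np.eye(N))
print(np.trace(S[1]) / N)  # 0.5 for this sequence
```

The same check extends to any \(r\): the selection matrices are orthogonal (\(S_iS_j=0\) for \(i\ne j\)) and sum to the identity.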
See Eq. (2.10).
I.I.D.N. stands for independently and identically normally distributed.
The sample is 1965m1–2004m11.
The reader should consider absolute values, because the employed regressors may be negatively correlated with each other, although they all measure the stance of monetary policy. Thus, the covariances between each of them and the omitted variable may have different signs, and so may the bias.
The reader should compare \(\mathop \sum \nolimits _{i=0}^1 {\hat{\varvec{\pi }}}_i \widehat{\varvec{\gamma }}_{{\varvec{i}}}\) with \({\varvec{\gamma }}_{{{\varvec{ols}}}}^{{\varvec{B}}}\).
The estimates obtained here are very similar to those of Chen (2007).
When recessions occur, the indicator equals \(-1\); when expansions occur, it equals \(+1\).
Table 3 (last row) reports these correlations.
Note that these estimates are equal to those reported in the first row of Table 3.
Note that Table 1 only reports the slopes of the regressions, whereas the test considers the entire vector of parameters.
There, the constant of the linear model was assumed to be zero. Here, the reader may verify that the average estimated constant of the R-S regressions, \(\mathop \sum \nolimits _{i=0}^1 \hat{\varvec{\pi }}_{{\varvec{i}}} {\widehat{{\alpha }}}_{i}\), is asymptotically equal to the parameter of the NBER indicator multiplied by its expected value, \(\mathop \sum \nolimits _{{i}=1}^r {\pi }_{i} \alpha z^{i}=\alpha \mathop \sum \nolimits _{i=1}^r \pi _i z^i=\alpha E[z]\), plus the constant of the linear model, where \(z^0=-1\) and \(z^1 =+1\) as specified in Note 27.
Indeed, it cannot be concluded that this relationship is linear, as the monetary policy variables may still have nonlinear (although non-R-S) effects on the stock market. The easiest way to verify this is to insert an interaction variable (the product of the NBER indicator and the measures of monetary policy) as a regressor. However, these interaction variables are never significant (results available upon request).
In fact, the NBER indicator cannot be employed for forecasting purposes, since it is released several months after its reference period. See also the conclusions of Morley et al. (2013) on this point.
To see this, note that the matrices \({{\varvec{S}}}_{{\varvec{i}}}\), \(i=1,\ldots ,r\), provide a partition of the sample consistent with the postulated regimes.
This is equivalent to differentiating the first-order conditions found in Hamilton (1990), eq. (5.8). It is also equivalent to considering the transformation of the raw data \(\tilde{{{\varvec{y}}}}=(\hat{{\varvec{p}}}_{{\varvec{i}}})^{0.5} {{\varvec{y}}}\), \({\tilde{{\varvec{x}}}}=(\hat{{{\varvec{p}}}}_{{\varvec{i}}})^{0.5}{{\varvec{x}}}\), and then finding the OLS variance-covariance matrix of \((\tilde{{{\varvec{x}}}}^\prime \tilde{{{\varvec{x}}}})^{-1}(\tilde{{{\varvec{x}}}}^\prime \tilde{{{\varvec{y}}}})\).
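The equivalence noted here — probability-weighted least squares versus OLS on the transformed data — can be checked numerically on synthetic inputs (the data and weights below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
x = np.column_stack([np.ones(N), rng.normal(size=N)])  # constant plus one regressor
y = rng.normal(size=N)
p = np.diag(rng.uniform(size=N))   # hypothetical smoothed probabilities for regime i

# Probability-weighted estimator (x' p_i x)^{-1} x' p_i y ...
b_wls = np.linalg.solve(x.T @ p @ x, x.T @ p @ y)

# ... equals OLS on the transformed data  y~ = p_i^0.5 y,  x~ = p_i^0.5 x.
xt, yt = np.sqrt(p) @ x, np.sqrt(p) @ y
b_ols = np.linalg.solve(xt.T @ xt, xt.T @ yt)

assert np.allclose(b_wls, b_ols)
```

The square root of the diagonal weight matrix is well defined because smoothed probabilities are non-negative.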
Note first that the matrices \(\hat{{{\varvec{p}}}}_{{\varvec{i}}}\) are functions of \({{\hat{\varvec{{\rho }}}}}_{\mathbf{{1},{\varvec{i}}}}\) \((i=1,2,\ldots ,r)\), with \(\hat{{\varvec{\rho }}}_{\mathbf{{1},{\varvec{i}}}}\equiv [\hat{{\varvec{\rho }}}_\mathbf{0 , {{\varvec{i}}}} \quad \hat{{\varvec{\theta }}}_{{\varvec{i}}}]\) and \(\hat{{\varvec{\theta }}}_{{\varvec{i}}}\equiv [\hat{{\varvec{\beta }}}_{{\varvec{i}}} \quad \hat{\sigma }]\), which are estimators of the parameters defined in A6, A8, and A9. Having then assumed that the relevant density and probability functions are continuous, I apply Proposition 2.27 of White (1991, chapter 2).
See White (1991, chapter 2), Proposition 2.27.
This is the result of the estimating procedure of Kim et al. (2008).
\({\varvec{\beta }} _\mathbf{1} =[0.5 \quad 0.8]^{{\prime }}\), \({\varvec{\beta }} _\mathbf{2} =[-0.5 \quad -0.8]^{{\prime }}\), \(\pi _1 =0.6\), \(\Pr \left[ {s_t =1 {|}s_{t-1} =1} \right] =0.8\), \(\Pr \left[ {s_t =2 |s_{t-1} =2} \right] =0.7\), \(\sigma _1^2 =0.5\), \(\sigma _2^2 =2\).
\({\varvec{\beta }}_\mathbf{1} =[0.5 \quad 0.8]^{{\prime }}\), \({\varvec{\beta }}_\mathbf{2} =[-0.5 \quad -0.8]^{{\prime }}\), \(\pi _1 =0.6\), \(\Pr \left[ {s_t =1{|}s_{t-1} =1} \right] =0.8\), \(\Pr \left[ {s_t =2{|}s_{t-1} =2} \right] =0.7\), \(\sigma _1^2 =0.5\), \(\sigma _2^2 =0.5\).
\({\varvec{\beta }}_\mathbf{1} =[0.5 \quad 0.8]^{{\prime }}\), \({\varvec{\beta }}_\mathbf{2} =[0.5 \quad -0.8]^{{\prime }}\), \(\pi _1 =0.6\), \(\Pr \left[ {s_t =1 {|}s_{t-1} =1} \right] =0.8\), \(\Pr \left[ {s_t =2 {|}s_{t-1} =2} \right] =0.7\), \(\sigma _1^2 =0.5\), \(\sigma _2^2 =2\). The case of constant variance is also considered; it does not exhibit different behavior with respect to the simulations with R-S variance.
References
Beccarini A (2010) Eliminating the omitted variable bias by a regime-switching approach. J Appl Stat 37:57–75
Carrasco M, Hu L, Ploberger W (2014) Optimal test for Markov switching parameters. Econometrica 82:765–784
Chen SS (2007) Does monetary policy have asymmetric effects on stock returns? J Money Credit Bank 39:667–688
Di Sanzo S (2007) Testing for linearity in Markov switching models: a bootstrap approach. Stat Method Appl 18:153–168
Greene WH (2008) Econometric analysis. Prentice Hall, Upper Saddle River
Hamilton JD (1989) A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57:357–384
Hamilton JD (1990) Analysis of time series subject to changes in regimes. J Econom 45:39–70
Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton
Kim CJ, Nelson CR (2000) State-space models with regime-switching. The MIT Press, Cambridge
Kim CJ, Piger J, Startz R (2008) Estimation of Markov regime-switching regression models with endogenous switching. J Econom 143:263–273
Krolzig HM (1997) Markov-switching vector autoregressions: modeling, statistical Inference, and application to business cycle analysis. Springer, Berlin
Meyers RA (2011) Complex systems in finance and econometrics. Springer, New York
Morley J, Piger J, Tien PL (2013) Reproducing business cycle features: are nonlinear dynamics a proxy for multivariate information? Stud Nonlinear Dyn E 17:483–498
Redner RA, Walker HF (1984) Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev 26:195–239
Timmermann A (2000) Moments of Markov switching models. J Econom 96:75–111
White H (1991) Asymptotic theory for econometricians. Academic Press, San Diego
Appendices
Appendix A: Proofs
Proof of Lemma 1
Note that \(var\left[ {{\widehat{\varvec{\beta }}} _{\varvec{i}} |{{\varvec{X}}}_\mathbf{{1}} ,{{\varvec{S}}}_{{{\varvec{i}}}}}\right] =\sigma ^{2}\left( {{{\varvec{X}}}_\mathbf{{1}} ^{{\prime }}{{\varvec{S}}}_{{{\varvec{i}}}}{{\varvec{X}}}_\mathbf{{1}}} \right) ^{-1}\), \(i=1,2,\ldots r\), holds by construction (Note 33). Suppose now that, in order to estimate this variance, the inverse of the negative of the second derivative of the log-likelihood function (the Hessian matrix), evaluated at the maximum likelihood estimator, is chosen. Then, by twice differentiating the expected log-likelihood (see Hamilton 1990) with respect to the parameter vector, I obtain (Note 34) expression (3): \( \widehat{{var}}\left[ {{\widehat{\varvec{\beta }}} _{\varvec{i}} |{{\varvec{X}}}_\mathbf{{1}} ,\widehat{{{\varvec{P}}}}_{{{\varvec{i}}}}} \right] =\widehat{\sigma }^{2}\left( {{{\varvec{X}}}_\mathbf{{1}}^{{\prime }}{\widehat{{{\varvec{P}}}}}_{{{\varvec{i}}}} {{\varvec{X}}}_\mathbf{{1}} } \right) ^{-1}\). \(\square \)
Proof of Lemma 2
If \(plim\,\widehat{{\varvec{\rho }}}_{{\mathbf{{1},{{\varvec{i}}}}}} ={\varvec{\rho }} _{{\mathbf{{1},{{\varvec{i}}}}}}\), the corresponding estimated smoothed probabilities are consistent (Note 35), that is:
If the regimes are separated (see Redner and Walker 1984), then:
since, by the definition of separation, each element of the main diagonal of \({{\varvec{P}}}_{{\varvec{i}}}\) must be either 0 or 1, consistently with the regime occurrence. The combination of (A.1) and (A.2) yields \(plim\, \widehat{{{\varvec{P}}}}_{{\varvec{i}}} ={{\varvec{S}}}_{{\varvec{i}}}\).
It also follows that \(plim\,\frac{tr(\widehat{{{\varvec{P}}}}_{{\varvec{i}}})}{N}=plim\,{\hat{{{\pi }}}} _{i} =\frac{tr\left( {{{\varvec{S}}}_{{\varvec{i}}}} \right) }{N}=\pi _i\). \(\square \)
Proof of Lemma 3
Define \(q_{l,m}\) as the lth-row, mth-column element of \(\frac{{{\varvec{X}}}_\mathbf{{1}}^{{\prime }}{{\varvec{X}}}_\mathbf{{1}}}{N}\) and \(p_{l,m}\) as the corresponding element of \(\frac{{{\varvec{X}}}_\mathbf{{1}}^{{\prime }}\widehat{{{\varvec{P}}}}_{{\varvec{i}}} {{\varvec{X}}}_\mathbf{{1}}}{N}\) (with \( l,m=1,\ldots ,K+1\)). Since each element of the main diagonal of \(\widehat{{{\varvec{P}}}}_{{\varvec{i}}}\) is a probability, \(|p_{l,m}| \le |q_{l,m}|\ \forall l,m\); furthermore, by A3, \(q_{l,m} <\infty\ \forall l,m\). Thus, all conditions of the Weierstrass theorem hold, so that each \(p_{l,m}\) converges (in probability) to a well-defined value. All of these values are collected in the \((K+1){\times }(K+1)\) matrix \({{\varvec{Q}}}_{{\varvec{i}}}\). \(\square \)
Proof of Lemma 4
Part 1. Assuming \(plim\, \widehat{\sigma }^{2}=\sigma ^{2}\), the asymptotic value of the estimated variance is obtained by substituting this limit, together with the result of Lemma 2, into expression (3) of Lemma 1. The fact that \(var\left[ {{\widehat{\varvec{\beta }}} _{\varvec{i}} |{{\varvec{X}}}_\mathbf{{1}} ,{{\varvec{S}}}_{{{\varvec{i}}}} } \right] =\sigma ^{2}\left( {{{\varvec{X}}}_\mathbf{{1}}^{{\prime }}{{\varvec{S}}}_{{{\varvec{i}}}} {{\varvec{X}}}_\mathbf{{1}} } \right) ^{-1}\), \({i=1,2,\ldots r}\), was already established at the beginning of the proof of Lemma 1.
Part 2. From Lemma 3, I obtain \(plim\,{{\varvec{X}}}_\mathbf{{1}} ^{{\prime }}\widehat{{{\varvec{P}}}}_{{\varvec{i}}} {{\varvec{X}}}_\mathbf{{1}} ={{\varvec{V}}}_{{\varvec{i}}}\), where each element of the \((K+1)\times (K+1)\) matrix \({{\varvec{V}}}_{{\varvec{i}}}\) goes to infinity. The same applies to \(plim\,{{\varvec{X}}}_\mathbf{{1}}^{{\prime }}{{\varvec{X}}}_\mathbf{{1}}\). Since \(0<\sigma ^{2}<\infty \) (see A9), the plim values of the corresponding variances are zero. \(\square \)
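As a numerical illustration of Part 2 (with invented data and weights, not the author's code), the estimated variance \(\widehat{\sigma }^{2}( {{\varvec{X}}}_\mathbf{{1}}^{{\prime }}\widehat{{{\varvec{P}}}}_{{\varvec{i}}} {{\varvec{X}}}_\mathbf{{1}}) ^{-1}\) shrinks as the sample grows, since each element of \({{\varvec{X}}}_\mathbf{{1}}^{{\prime }}\widehat{{{\varvec{P}}}}_{{\varvec{i}}} {{\varvec{X}}}_\mathbf{{1}}\) diverges with \(N\):

```python
import numpy as np

rng = np.random.default_rng(5)
sigma2 = 1.0  # hypothetical error variance, finite and positive as in A9

def var_slope(N):
    # Slope element of sigma^2 (X1' P_i X1)^{-1} with random probability weights.
    X1 = np.column_stack([np.ones(N), rng.normal(size=N)])
    P = np.diag(rng.uniform(0.2, 1.0, size=N))  # smoothed probs, bounded away from 0
    return (sigma2 * np.linalg.inv(X1.T @ P @ X1))[1, 1]

# The estimated variance vanishes as the sample grows (Lemma 4, Part 2).
v_small, v_large = var_slope(100), var_slope(10_000)
assert v_large < v_small
```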
Proof of Lemma 5
First, note that by definition, under model A),
Conditional on \({{\varvec{S}}}_{{\varvec{i}}} (i=1,2,\ldots ,r)\), the following holds:
since \({{\varvec{S}}}_{{\varvec{i}}}\mathop \sum \nolimits _{j=1}^r {{\varvec{S}}}_{{\varvec{j}}} {{\varvec{X}}}_\mathbf{{1}} {\varvec{\beta }}_{{\varvec{j}}} =\mathop \sum \nolimits _{j=1}^r {{\varvec{S}}}_{{\varvec{i}}} {{\varvec{S}}}_{{\varvec{j}}} {{\varvec{X}}}_\mathbf{{1}}{\varvec{\beta }}_{{\varvec{j}}} ={{\varvec{S}}}_{{\varvec{i}}} {{\varvec{S}}}_{{\varvec{i}}} {{\varvec{X}}}_\mathbf{{1}} {\varvec{\beta }}_{{\varvec{i}}}={{\varvec{S}}}_{{\varvec{i}}} {{\varvec{X}}}_\mathbf{{1}} {\varvec{\beta }}_{{\varvec{i}}} \), then
Since \(E\left[ {{{\varvec{U}}}|{{\varvec{S}}}_{{\varvec{i}}} ,{{\varvec{X}}}_\mathbf{{1}}} \right] =0\) (see A9), the following holds:
Now, from Lemma 2, \(plim\,\widehat{{{\varvec{P}}}}_{{\varvec{i}}} ={{\varvec{S}}}_{{\varvec{i}}}\); thus, by the theorem on convergence in probability of continuous functions (Note 36), it also holds:
Equation (A.4) and the result of Lemma 4.2 provide sufficient conditions for the limit in probability of \({\widehat{\varvec{\beta }}} _{\varvec{i}}\):
\(\square \)
Proof of Proposition 1
For the OLS estimator, it holds:
In order to find \(E\left[ {{\widehat{\varvec{\beta }}} _{{\varvec{Ols}}} |{{\varvec{X}}}_\mathbf{{1}}} \right] \), the following relationship may be used:
Again, conditioning on \(S_i\) \({(i=1,2,\ldots ,r)}\) is allowed by assuming Lemma 2. Thus, since \(E\left[ {{{\varvec{U}}}|{{\varvec{S}}}_{{\varvec{i}}} ,{{\varvec{X}}}_\mathbf{{1}}} \right] =0\) (see A9), it holds (Note 37):
Equation (A.8) and Lemma 4.2 provide sufficient conditions for convergence in probability of \({\widehat{\varvec{\beta }}} _{{\varvec{Ols}}} \): \(plim{\widehat{\varvec{\beta }}} _{{\varvec{Ols}}} =\mathop \sum \nolimits _{j=1}^r \pi _j {\varvec{\beta }}_{{\varvec{j}}} \). Lemmas 2 and 5 imply \(plim\mathop \sum \nolimits _{j=1}^r {\hat{{{\pi }}}} _{\varvec{j}} {\widehat{\varvec{\beta }}}_{\varvec{j}} =\mathop \sum \nolimits _{j=1}^r\pi _j {\varvec{\beta }} _{{\varvec{j}}}\). \(\square \)
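Proposition 1's limit can be illustrated with a small Monte Carlo sketch (hypothetical two-regime parameters, i.i.d. switching independent of the regressor, as under model A):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000
pi1 = 0.6                              # P[regime 1]; regime 2 has probability 0.4
s = np.where(rng.uniform(size=N) < pi1, 1, 2)   # switching independent of x
x = rng.normal(size=N)
beta_s = np.where(s == 1, 2.0, -1.0)   # illustrative regime slopes beta_1, beta_2
y = beta_s * x + rng.normal(size=N)

# Pooled OLS on the (misspecified) linear model y = b*x + u.
b_ols = (x @ y) / (x @ x)

# Proposition 1: plim b_ols = pi_1*beta_1 + pi_2*beta_2 = 0.6*2.0 + 0.4*(-1.0) = 0.8
assert abs(b_ols - 0.8) < 0.05
```

The pooled OLS slope converges to the probability-weighted average of the regime-specific slopes, not to either regime's parameter.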
Proof of Proposition 2
Note that, under model B) \(plimE\left[ {{\widehat{\varvec{\beta }}} _{{\varvec{Ols}}} |{{\varvec{X}}}_\mathbf{{1}}} \right] \ne {\ddot{{\varvec{\beta }}}_{{\varvec{i}}}} (i=1,2,\ldots ,r)\). However, assuming Lemma 2, Beccarini (2010) has shown that: \(E\left[ {{\widehat{\varvec{\beta }}} _{\varvec{i}} |{{\varvec{S}}}_{{\varvec{i}}} ,{{\varvec{X}}}_\mathbf{{1}}}\right] ={\ddot{{\varvec{\beta }}}}_{{\varvec{i}}}\). This result and Lemma 4.2 provide sufficient conditions for convergence in probability. \(\square \)
Proof of Proposition 3
First, the consistency of \({\widehat{{{\varvec{E}}}}}\left[ {{{\varvec{U}}}|{{\varvec{X}}}_\mathbf{{1}}} \right] =\mathop \sum \nolimits _{i=1}^r {\widehat{{{\varvec{E}}}}} \left[ {{{\varvec{U}}}|{{\varvec{S}}}_{{\varvec{i}}} ,{{\varvec{X}}}_\mathbf{{1}}} \right] \widehat{{{\varvec{P}}}}_{{\varvec{i}}}\) must be established. This follows from the fact that \({\widehat{E}} \left[ {{{\varvec{U}}}|{{\varvec{S}}}_{{\varvec{i}}},{{\varvec{X}}}_\mathbf{{1}}} \right] \) is consistent (Note 38); Lemma 2 (\(plim\, \widehat{{{\varvec{P}}}}_{{\varvec{i}}} ={{\varvec{S}}}_{{\varvec{i}}}\)) must also be assumed.
Consistency of \(\hat{{\varvec{\delta }}}_{{\varvec{j}}}\) is established in Kim et al. (2008) when their estimating procedure is applied; it remains to establish the asymptotic properties of \(\tilde{{\varvec{\delta }}}_{{\varvec{Ols}}}\) (the OLS estimator of model (A.1)). Note now that,
since \(plim\left( {{{\varvec{X}}}_\mathbf{3 }^{{{\prime }}}{{\varvec{X}}}_\mathbf{3 } } \right) ^{-1}\left( {{{\varvec{X}}}_\mathbf{3 }^{{{\prime }}}\left[ {{{\varvec{U}}}-{\widehat{{{\varvec{E}}}}} \left[ {{{\varvec{U}}}|{{\varvec{X}}}_\mathbf{{1}}} \right] } \right] } \right) =0\), then \(plimE\left[ \tilde{{\varvec{\delta }}} _{{\varvec{Ols}}} |{{\varvec{S}}}_{{\varvec{i}}} ,{{\varvec{X}}}_\mathbf{3 } \right] ={{{\varvec{\delta }}}} _{{\varvec{i}}}\); furthermore,
Equation (A.10) and Lemma 4.2 provide sufficient conditions for convergence in probability of \(\tilde{{\varvec{\delta }}}_{{\varvec{Ols}}}\) (and hence of \({\widehat{\varvec{\beta }}} _{{\varvec{Ols}}}\)): \(plim\,{\widehat{\varvec{\beta }}} _{{\varvec{Ols}}} =\mathop \sum \nolimits _{j=1}^r \pi _j {\varvec{\beta }}_{{\varvec{j}}}\). Furthermore, the estimating procedure of Kim et al. (2008) implies that \(plim\,\mathop \sum \nolimits _{j=1}^r {\hat{{{\pi }}}} _{\varvec{j}} {\widehat{\varvec{\beta }}}_{{\varvec{j}}} =\mathop \sum \nolimits _{j=1}^r \pi _j {\varvec{\beta }}_{{\varvec{j}}}\). \(\square \)
Appendix B: Further simulations
The simulations in Fig. 6 show the behavior of the proposed test (Cumulative Distribution Function, CDF) when a constant variance is erroneously assumed instead of the correct R-S variance. Simulations are based on the following set of parameters: \({\varvec{\beta }} _\mathbf{1} =[2 \quad 1]^{{\prime }}\), \({\varvec{\beta }} _\mathbf{2} =[-2 \quad -1]^{{\prime }}\), \(\pi _1 =0.42\), \(\Pr \left[ {s_t =1 {|}s_{t-1} =1} \right] =0.47\), \(\Pr \left[ {s_t =2 {|}s_{t-1} =2} \right] =0.61\). The unique regressor in \({{\varvec{X}}}_\mathbf{{1}}\) is standard normally distributed, and the error term \({{\varvec{U}}}\) is uniformly distributed over the interval \((-1, +1)\) in Regime 1 and over the interval \((-2, +2)\) in Regime 2.
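A data-generating process of this kind can be sketched as follows (an illustrative simulation under the stated parameters, not the author's simulation code):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 50_000
p11, p22 = 0.47, 0.61            # regime persistence probabilities from the text

# Simulate the two-state Markov chain s_t.
s = np.empty(N, dtype=int)
s[0] = 1
for t in range(1, N):
    stay = p11 if s[t - 1] == 1 else p22
    s[t] = s[t - 1] if rng.uniform() < stay else 3 - s[t - 1]

# Regime-switching mean with the stated betas; U is uniform on (-1, 1)
# in regime 1 and on (-2, 2) in regime 2 (a non-normal error, as in the text).
X1 = np.column_stack([np.ones(N), rng.normal(size=N)])
beta1, beta2 = np.array([2.0, 1.0]), np.array([-2.0, -1.0])
y = np.where(s == 1, X1 @ beta1, X1 @ beta2) \
    + rng.uniform(-1, 1, N) * np.where(s == 1, 1.0, 2.0)

# The ergodic probability of regime 1 is (1 - p22)/((1 - p11) + (1 - p22)),
# which is approximately 0.42, matching pi_1 above.
pi1_hat = np.mean(s == 1)
assert abs(pi1_hat - 0.42) < 0.02
```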
The simulations in the following graphs consider an autoregressive term as the (unique) explanatory variable in the data-generating process. The error term is always standard normally distributed. Three main cases are considered:
- R-S process for the mean parameters (constant and autoregressive parameter) and the variance (Note 39); see Figs. 7, 8, 9, 10 and 11;
- R-S process for the mean parameters only (Note 40); see Figs. 12, 13, 14, 15 and 16;
- R-S process for the autoregressive parameter and the variance only (Note 41); see Figs. 17, 18, 19, 20 and 21.
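For the first case, a Markov-switching AR(1) can be sketched as follows (an illustrative simulation using the parameters of Note 39, interpreting each \({\varvec{\beta }}_{{\varvec{i}}}\) as [constant, AR coefficient]' — an assumption about the ordering, not stated in the note):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5_000
# Note 39 parameters: R-S constant, AR coefficient and variance.
c   = {1: 0.5, 2: -0.5}
phi = {1: 0.8, 2: -0.8}
sig = {1: np.sqrt(0.5), 2: np.sqrt(2.0)}
p11, p22 = 0.8, 0.7

s = np.empty(N, dtype=int)
s[0] = 1
y = np.zeros(N)
for t in range(1, N):
    stay = p11 if s[t - 1] == 1 else p22
    s[t] = s[t - 1] if rng.uniform() < stay else 3 - s[t - 1]
    # R-S process for the constant, AR parameter and variance.
    y[t] = c[s[t]] + phi[s[t]] * y[t - 1] + sig[s[t]] * rng.normal()

# Ergodic regime-1 probability: (1 - 0.7)/((1 - 0.8) + (1 - 0.7)) = 0.6 = pi_1.
assert abs(np.mean(s == 1) - 0.6) < 0.05
```

The other two cases follow by holding the variance (Note 40) or the constant (Note 41) fixed across regimes.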
Beccarini, A. Testing for the omission of relevant variables and regime-switching misspecification. Empir Econ 56, 775–796 (2019). https://doi.org/10.1007/s00181-017-1373-8