1 Introduction

The correct estimation of a statistical model is of primary importance for empirical analysis. A widespread problem is the Omitted Variables Bias (OVB): biased estimates arise whenever some relevant variables are omitted from the model and, at the same time, are correlated with some regressors. McCallum (1972) was probably the first to draw attention to the OVB in econometric analysis. However, the OVB is a problem common to all social sciences. For example, Clarke (2009) addresses this problem in political research and highlights that the typical solution of using more regressors may be erroneous. In a recent paper, Rinella et al. (2020) warn about the OVB due to missing factors in models of plant-plant interactions. Wilms et al. (2021) highlight the danger of the OVB in psychological analysis.

In econometric analysis, macro-econometric models, often estimated with Vector Autoregressions (VARs), are primary tools that central banks and governments use to describe the economy and determine economic policy. Should the estimates be affected by the OVB, the resulting policy errors may imply substantial social costs.

The typical solution to the OVB requires additional information, i.e. the omitted variable itself. Accordingly, without this additional information, the researcher is left with biased estimates. However, some progress has been made in circumventing the problem of missing information; for example, Sessions and Stevans (2006) developed a genetic algorithm to overcome the omitted variable bias under the assumption that the omitted variable is dichotomous (binary); Beccarini (2015) shows how the Kalman filter may be applied to filter omitted variables and instruments. Uemukai (2011) derives the sample properties of the ridge regression estimator when omitted variables are present.

On the other hand, it must be highlighted that the OVB is not necessarily a problem. In fact, the researcher could be interested only in the optimal forecast of the dependent variable, which does not necessarily require unbiased estimates.

Against this background, the objective of this article is to propose a test for detecting the OVB, which uses the information embedded in a basic (say OLS) estimate and in the Maximum-Likelihood with Kalman Filter (ML-KF) estimate without employing the potential omitted variable or a proxy for it. The reader may refer to Hamilton (1994, Chap. 13), for an introduction to the ML-KF; a more comprehensive review of the topic can be found in Douc et al. (2014), in particular Chapter 2. An early survey of different uses of the KF in econometrics can be found in Schneider (1988). Some practical issues can be found in Solberger and Spånberger (2020), Kovvali et al. (2022), Strouwen et al. (2023), Chang et al. (2009), Everaert (2010), Pagan (1984) and Shumway and Stoffer (1982).

More formally, we propose a modified Hausman-type test, see Hausman (1978), where the null hypothesis refers to a (linear) model with no correlation between omitted variables and regressors, and the alternative hypothesis assumes a different-from-zero correlation between omitted variables and regressors. This modified Hausman test is derived by integrating two different Hausman tests, which consider more specific null hypotheses. In the empirical part, two illustrative examples are provided.

In the first example, the estimation of the new Keynesian Phillips curve (PC) is considered, which is a crucial component of a typical macro-econometric model; see, for example, Galí (2015). The proposed test is applied to verify the correctness of the PC as estimated in the seminal paper of Galí and Gertler (1999). Traditionally, the PC has been specified as a relationship between the current inflation rate, the output gap, and the lagged inflation rate. However, in the second half of the past century, the rational expectations revolution gained ground, initiated by the early work of Muth (1961) and corroborated by the works of Lucas (1972) and Sargent et al. (1973). This has yielded a rational expectations-augmented PC, in which a rational expectation of future inflation is able to explain a substantial part of current inflation. However, some economists and psychologists now cast severe doubt on the assumption that economic agents are rational; see, for example, Estrella and Fuhrer (2002) and Coibion et al. (2018). Thus, there is a need to test whether, first, expectations are important in determining inflation, and second, they are indeed rational. Furthermore, even if the rational expectations hypothesis is plausible, the estimates require additional information in the form of appropriate instrumental variables, in line with the Generalized Method of Moments (GMM) based procedure. These, however, may not be available to the researcher. By contrast, thanks to the Kalman filter-based procedure, we neither impose this hypothesis nor require additional information. Moreover, through the proposed test we can statistically verify the two hypotheses separately.

In the second example, the proposed test and the underlying procedure are extended to VARs. In this context, the proposed test shows that the effect of monetary policy on inflation (US data) as measured by the impulse response function in the VAR must be biased. In fact, in a seminal paper, Sims (1992) showed that a (standard) VAR analysis of output, inflation rate and monetary policy interest rate exhibits a “puzzle” in the impulse response function of the inflation rate to the interest rate shock, the so-called “price puzzle”. He ascribed that problem to the presence of an omitted variable (in the standard VAR), namely a commodity price index, because the central bank’s information set is larger than that assumed in the standard VAR. However, the literature has not reached consensus on this solution. Furthermore, Forni and Gambetti (2014) outline a test to verify whether a Factor-Augmented Vector Autoregression (FAVAR) of Bernanke et al. (2005) contains sufficient information to estimate the structural shocks. Their method essentially follows the approach of the FAVAR: in fact, the researcher has to start with a “large data set” in order to find the so-called sufficient information. The main result is a test for verifying whether this information in the FAVAR is sufficient. However, there is no guarantee that large data sets contain sufficient information.

The remainder of the article is organized as follows. In the next section, necessary assumptions and setups of the test are outlined, along with the filtering procedure and the asymptotic properties of the ML-KF and of the OLS estimators. In section three, the test is constructed. Section four extends the method to the vector autoregression context. Section five provides the empirical analysis with two illustrative examples. Conclusions then follow. Proofs, simulations, details, and some other additional materials are provided in Annexes.

2 Eliminating the omitted variable bias through Kalman filtering: the univariate case

In order to introduce the main concept of the proposed test, consider first the univariate case. Assume two alternative data-generating processes (DGP) for the dependent variable, \({q}_{t}\):

$${q}_{t}=c+{{\varvec{p}}}_{{\varvec{t}}}^{\prime}{\varvec{\beta}}+{\varsigma }_{t},\quad t=1,\dots ,N,$$
(1a)
$${q}_{t}=c+{{\varvec{p}}}_{{\varvec{t}}}^{\prime}{\varvec{\beta}}+{v}_{t}{\beta }_{1}+{\varepsilon }_{t},\quad t=1,\dots ,N,$$
(1b)

where c is a constant, \({{\varvec{p}}}_{{\varvec{t}}}\) is a (column) vector of k explanatory and observed variables (\({p}_{jt}\): j = 1,…,k), \({v}_{t}\) is the unobserved, explanatory (latent) variable; \({\varvec{\beta}}\) and \({\beta }_{1}\) are, respectively, a conformable vector and a scalar of constant parameters; \({\varepsilon }_{t}\) and \({\varsigma }_{t}\) are homoscedastic and uncorrelated error terms such that \(E\left[{\varepsilon }_{t}\right]=E\left[{\varsigma }_{t}\right]=0\); furthermore, \(E\left[{\varsigma }_{t}|{{\varvec{p}}}_{{\varvec{s}}}\right]=0\) and \(E\left[{\varepsilon }_{t}|{{\varvec{p}}}_{{\varvec{s}}}^{\prime},{v}_{s}\right]=0\) for \(t,s=1,\dots ,N\), where N is the sample size. Now assume two different hypotheses characterizing the DGP of Eq. (1b):

$$Cov\left[{p}_{jt},{v}_{t}\right]=0\quad \text{for all } j\in \{1,\dots ,k\};$$
(2a)
$$Cov\left[{p}_{jt},{v}_{t}\right]\ne 0\quad \text{for some } j\in \{1,\dots ,k\}.$$
(2b)

The DGP of \(v\) can be specified as follows:

$${v}_{t}=\gamma +\delta {v}_{t-1}+{\epsilon }_{t},\quad \left|\delta \right|\le 1,\quad t=1,\dots ,N,$$
(3)

where \(\gamma \) and \(\delta \) are constant parameters and \({\epsilon }_{t}\) is a homoscedastic and uncorrelated error term such that \(E\left[{\epsilon }_{t}\right]=0\). Note that an alternative way to specify the model of Eq. (1b) under Eq. (2b) consists of inserting the (exogenous) vector \({{\varvec{p}}}_{{\varvec{t}}}\) in the (transition) equation of Eq. (3): \({v}_{t}=\gamma +\delta {v}_{t-1}+{{\varvec{p}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}{{\varvec{\delta}}}_{1}+{\epsilon }_{t}\), where \({{\varvec{\delta}}}_{1}\) is a conformable vector of constant parameters, see Douc et al. (2014, Chap. 2).
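To make the two hypotheses concrete, the following is a minimal simulation sketch of Eqs. (1b) and (3) with a single regressor. The way the regressor is generated (`p = rho * v + noise`), the parameter values and the function name are illustrative assumptions, not part of the article's design: `rho = 0` corresponds to Eq. (2a) and `rho != 0` to Eq. (2b). The function returns the OLS slope of the regression that omits \({v}_{t}\).

```python
import numpy as np

def simulate_ols_slope(rho, n_obs=20000, beta=1.0, beta1=1.0,
                       c=0.5, gamma=0.0, delta=0.5, seed=0):
    """Simulate Eqs. (1b) and (3) with one regressor and return the OLS
    slope of the regression of q_t on a constant and p_t (v_t omitted)."""
    rng = np.random.default_rng(seed)
    # Latent AR(1) variable of Eq. (3), started at its first innovation.
    eps_v = rng.standard_normal(n_obs)
    v = np.empty(n_obs)
    v[0] = eps_v[0]
    for t in range(1, n_obs):
        v[t] = gamma + delta * v[t - 1] + eps_v[t]
    # Hypothetical regressor: rho = 0 gives Eq. (2a), rho != 0 gives Eq. (2b).
    p = rho * v + rng.standard_normal(n_obs)
    q = c + beta * p + beta1 * v + rng.standard_normal(n_obs)
    # OLS on the underspecified model: only a constant and p_t are used.
    X = np.column_stack([np.ones(n_obs), p])
    coef, *_ = np.linalg.lstsq(X, q, rcond=None)
    return coef[1]

slope_2a = simulate_ols_slope(rho=0.0)
slope_2b = simulate_ols_slope(rho=1.0)
print(f"OLS slope under (2a): {slope_2a:.3f}; under (2b): {slope_2b:.3f}")
```

With these assumed values the true slope is 1, and the large-sample shift of the OLS slope under `rho = 1` is \({\beta }_{1}Cov[p,v]/Var[p]=4/7\), whereas under `rho = 0` the slope stays close to 1.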

For simplicity of exposition only, one omitted variable is contemplated here and its DGP [Eq. (3)] is assumed to be a not necessarily stationary AR(1) process. The exact functional form can be verified by means of maximum-likelihood principles; see Harvey (1989, Chaps. 1.5, 5 and 7.5). This implies that every specification of the process of the latent variable(s) can be tested against several alternatives. For example, a likelihood ratio test can be applied, whereby the likelihoods are obtained under several alternative specifications of Eq. (3), including the generalization which considers the observable variables. See also Douc et al. (2014) or Basistha and Nelson (2007) for this approach. A further (testable) generalization regards non-Gaussian and nonlinear state space models, which involve the Unscented Kalman filter and possibly particle filtering instead of the linear filter; see Douc et al. (2014) for details.

The objective is to obtain a statistical test for detecting the OVB without using the variable \(v\) (or an observed proxy for it), irrespective of which DGP [Eq. (1a) or Eq. (1b)] is true. To this end, two different estimators for \({\varvec{\beta}}\) are used: the OLS estimator, say \({\widehat{{\varvec{\upbeta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\), and the ML-KF estimator, say \({\widehat{{\varvec{\upbeta}}}}_{{\varvec{K}}{\varvec{F}}}\). Both are based on the limited information set, \({{\varvec{I}}}_{{\varvec{N}}}=({q}_{N},{q}_{N-1},\dots ,{q}_{1}; {{\varvec{p}}}_{{\varvec{N}}}^{\boldsymbol{^{\prime}}},{{\varvec{p}}}_{{\varvec{N}}-1}^{\boldsymbol{^{\prime}}},\dots ,{{\varvec{p}}}_{1}^{\boldsymbol{^{\prime}}})\); the full information set is instead defined as \({{\varvec{F}}}_{{\varvec{N}}}=\left({{\varvec{I}}}_{{\varvec{N}}}; {v}_{N},{v}_{N-1},\dots ,{v}_{1}\right)\). Note, however, that the available information may include the knowledge of Eqs. (1b) and (3), which is used for \({\widehat{{\varvec{\upbeta}}}}_{{\varvec{K}}{\varvec{F}}}\) only. On the other hand, note that \({\widehat{{\varvec{\upbeta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\) is simply based on Eq. (1a) and hence is biased and inconsistent (only) if the DGP of Eq. (1b) coupled with Eq. (2b) is the true process. However, when it is consistent [Eq. (1a), or Eq. (1b) with Eq. (2a)], it does not require complete knowledge of the DGP [in particular the specification inherent to the latent variable of Eqs. (1b) and (3)]. Annex A provides a review of the properties of the OLS and of the KF-based estimators.
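As an illustration of how the knowledge of Eqs. (1b) and (3) enters the ML-KF estimator, the sketch below evaluates the Gaussian log-likelihood of the state-space form (observation equation (1b), transition equation (3)) through the prediction-error decomposition of a scalar Kalman filter. It is a sketch under stated assumptions rather than the article's implementation: the function name `kf_loglik`, the Gaussian errors and the initial state moments `v0` and `var0` are hypothetical choices.

```python
import numpy as np

def kf_loglik(q, P, c, beta, beta1, gamma, delta, sig2_eps, sig2_v,
              v0=0.0, var0=1e6):
    """Gaussian log-likelihood of Eqs. (1b) and (3) via the prediction-error
    decomposition of a scalar Kalman filter."""
    q = np.asarray(q, dtype=float)
    P = np.asarray(P, dtype=float).reshape(len(q), -1)
    beta = np.asarray(beta, dtype=float)
    v_filt, var_filt = v0, var0  # assumed moments of the initial state
    loglik = 0.0
    for t in range(len(q)):
        # Prediction step from the transition equation, Eq. (3).
        v_pred = gamma + delta * v_filt
        var_pred = delta**2 * var_filt + sig2_v
        # One-step-ahead forecast error of the observation, Eq. (1b).
        err = q[t] - (c + P[t] @ beta + beta1 * v_pred)
        f = beta1**2 * var_pred + sig2_eps  # forecast-error variance
        loglik += -0.5 * (np.log(2.0 * np.pi * f) + err**2 / f)
        # Update step: Kalman gain, filtered mean and variance of v_t.
        gain = var_pred * beta1 / f
        v_filt = v_pred + gain * err
        var_filt = var_pred - gain * beta1 * var_pred
    return loglik
```

The function only evaluates the likelihood; maximizing it over the parameters with a numerical optimizer (and with whatever normalization of the latent variable's scale the researcher adopts) would deliver the ML-KF estimates. When the state variance and the initial variance are set to zero, the expression collapses to the Gaussian log-likelihood of Eq. (1a).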

2.1 The comparison of KF-based and OLS estimators under the assumed DGPs

The properties of \({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\) and \({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\) under the several DGPs are now formalized. All proofs are provided in Annex A.2.

Lemma 1

Under the DGP of Eq. (1a) and the assumptions outlined in Annex A.1, it holds: (a) \({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\) is consistent and efficient; (b) \({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\) is consistent, but inefficient.

Lemma 2

Under the DGP of Eqs. (1b) and (2a) and the assumptions outlined in Annex A.1, it holds: (a) \({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\) is consistent and inefficient; (b) \({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\) is consistent and efficient.

Lemma 3

Under the DGP of Eqs. (1b) and (2b) and the assumptions outlined in Annex A.1, it holds: (a) \({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\) is inconsistent and inefficient; (b) \({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\) is consistent and efficient.

Thus, under both hypotheses of Eqs. (2a) and (2b), \({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\) is the efficient estimator, see Davidson and MacKinnon (2021, Chap. 3.7): \(\underset{N\to \infty }{\mathrm{lim}}Var\left[\sqrt{N}({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}-{\varvec{\beta}})| {{\varvec{I}}}_{{\varvec{N}}}\right]\ge \underset{N\to \infty }{\mathrm{lim}}Var\left[\sqrt{N}({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}-{\varvec{\beta}})| {{\varvec{I}}}_{{\varvec{N}}}\right]\). Throughout the article, such an inequality means that the difference between the covariance matrix on the l.h.s. and the covariance matrix on the r.h.s. is positive semidefinite; the equality sign only holds if \({{\varvec{I}}}_{{\varvec{N}}}\) does not provide any information at all about the variable \(v\). However, for sufficiently large samples and under the hypothesis of Eq. (2a), the efficiency gain related to the ML-KF estimator becomes negligible compared to the complexity of its computation (which also increases with the size of the sample). Thus, under the model of Eqs. (1b) and (2a), since both estimators are consistent, the OLS estimates should be used, because the procedure is based on milder assumptions and is easier to compute.
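The source of the inconsistency in Lemma 3 can be traced in one worked step. This is only a sketch: it assumes demeaned variables (so that c drops out), a stationary \({v}_{t}\) (\(|\delta |<1\)) and convergent sample moments, and the symbols \({{\varvec{\Sigma}}}_{pp}\) and \({{\varvec{\sigma}}}_{pv}\) are introduced here for illustration only. Stack the observations into the (N × k) matrix \({\varvec{P}}=({{\varvec{p}}}_{1}\dots {{\varvec{p}}}_{N})^{\prime}\) and the (N × 1) vectors \({\varvec{q}}\), \({\varvec{v}}\) and \({\varvec{\varepsilon}}\). Substituting Eq. (1b) into the OLS formula gives

$${\widehat{{\varvec{\beta}}}}_{OLS}={({{\varvec{P}}}^{\prime}{\varvec{P}})}^{-1}{{\varvec{P}}}^{\prime}{\varvec{q}}={\varvec{\beta}}+{({{\varvec{P}}}^{\prime}{\varvec{P}})}^{-1}{{\varvec{P}}}^{\prime}{\varvec{v}}\,{\beta }_{1}+{({{\varvec{P}}}^{\prime}{\varvec{P}})}^{-1}{{\varvec{P}}}^{\prime}{\varvec{\varepsilon}},$$

$$\mathrm{plim}\,{\widehat{{\varvec{\beta}}}}_{OLS}={\varvec{\beta}}+{\beta }_{1}{{\varvec{\Sigma}}}_{pp}^{-1}{{\varvec{\sigma}}}_{pv},\quad {{\varvec{\Sigma}}}_{pp}=\mathrm{plim}\,\frac{1}{N}{{\varvec{P}}}^{\prime}{\varvec{P}},\quad {{\varvec{\sigma}}}_{pv}=\mathrm{plim}\,\frac{1}{N}{{\varvec{P}}}^{\prime}{\varvec{v}}.$$

The second term vanishes when \({\beta }_{1}=0\) [Eq. (1a)] or when \({{\varvec{\sigma}}}_{pv}=0\) [Eq. (2a)]; under Eq. (2b) with \({\beta }_{1}\ne 0\) it does not, which is the OVB.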

3 Constructing the test

Now, in the light of the results of the above section, a quadratic form-based test is developed by comparing the ML-KF estimates with the OLS estimates. To this end, two possible null hypotheses and two (coincident) alternative hypotheses are defined. Then, it will be shown how these two null hypotheses integrate into one. Thereafter, a (modified) version of the test originally proposed in Hausman (1978) is introduced. All proofs are provided in Annex A.2.

3.1 The null and alternative hypotheses

In principle, given the DGPs outlined above, two possible tests can be specified, with two related pairs of (null and alternative) hypotheses.

\({H}_{\mathrm{0,1}}\): Eq. (1a); \({H}_{\mathrm{1,1}}\): Eqs. (1b) and (2b);

\({H}_{\mathrm{0,2}}\): Eqs. (1b), (2a); \({H}_{\mathrm{1,2}}\): Eqs. (1b) and (2b).

Thus, \({H}_{\mathrm{0,1}}\) refers to a DGP where there is no omitted variable, whereas \({H}_{\mathrm{0,2}}\) considers the case of an omitted variable that is uncorrelated with all regressors. The coincident alternative hypotheses (\({H}_{\mathrm{1,1}}\equiv {H}_{\mathrm{1,2}}\)) refer to the case where an omitted variable is present and causes a bias in the estimates.

Theorem 1

When the null hypothesis is specified as \({H}_{\mathrm{0,1}}\), and Lemma 1 holds, then \(\sqrt{N}{(\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}-{\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}})\stackrel{d}{\to }N(0,V)\) where \(V=\underset{N\to \infty }{\mathit{lim}}Var\left(\sqrt{N}{(\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}-{\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}})\right)\); since, due to Lemma 1, \({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\) is the efficient estimator, then \(Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}-{\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\right)=Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}} \right)-Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\right)\).

Theorem 2

When the null hypothesis is specified as \({H}_{\mathrm{0,2}}\), and Lemma 2 holds, then \(\sqrt{N}{(\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}-{\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}})\stackrel{d}{\to }N(0,V)\) where \(V=\underset{N\to \infty }{\mathit{lim}}Var\left(\sqrt{N}{(\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}-{\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}})\right)\); since, due to Lemma 2, \({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\) is the efficient estimator, then \(Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}-{\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\right)=Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}} \right)-Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\right)\).

Theorem 3

Under the unique alternative hypothesis (\({H}_{\mathrm{1,1}}\equiv {H}_{\mathrm{1,2}}\)), if Lemma 3 holds, \({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\) is consistent and \({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\) is not. Thus, their difference does not converge to zero. Furthermore, \({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\) is still the efficient estimator, \(Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}-{\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\right)=Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\right)-Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\right)\).

3.2 The test

Given the specification of the two null hypotheses outlined above, two corresponding quadratic-form-based tests for testing the OVB are first proposed, which (due to the efficiency properties outlined above) take the form of a Hausman test (Hausman 1978). At this juncture, note that the above specification of the null hypotheses implies:

$${H}_{\mathrm{0,1}} \iff Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}} \right)-Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\right)<0;$$
$${H}_{\mathrm{0,2}} \iff Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}} \right)-Var\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\right)>0.$$

Thus, the relative magnitude of the variances uniquely determines the underlying null hypothesis. The consideration of their (consistent) estimates might help us to decide which null is more plausible (leading to a simplification of the above structure of the hypotheses). If, however, the researcher does not aim at testing the true DGP, but only the presence of an omitted variable bias affecting \({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\) (as assumed in this article), a modified Hausman test may be applied, which takes the absolute value of the relevant test statistic.
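The idea of reading the more plausible null from the estimated variances can be sketched as follows. The function name, the tolerance and the eigenvalue-based check of definiteness are assumptions made for illustration; the article does not prescribe a specific procedure for this step.

```python
import numpy as np

def plausible_null(var_ols, var_kf, tol=1e-10):
    """Classify the sign of Var(OLS) - Var(KF) through its eigenvalues."""
    diff = np.asarray(var_ols, dtype=float) - np.asarray(var_kf, dtype=float)
    # Symmetrize to guard against rounding before computing eigenvalues.
    eig = np.linalg.eigvalsh((diff + diff.T) / 2.0)
    if np.all(eig < -tol):
        return "H_{0,1}"  # negative definite difference: OLS is more precise
    if np.all(eig > tol):
        return "H_{0,2}"  # positive definite difference: ML-KF is more precise
    return "undetermined"  # indefinite or singular difference

print(plausible_null(np.diag([1.0, 1.0]), np.diag([2.0, 3.0])))
```

In finite samples the estimated difference may well be indefinite, which is one reason for working with the absolute value of the quadratic form rather than with a pre-selected null.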

Theorem 4

Based on both OLS and ML-KF estimates, define the statistic (T) as follows.

$$T=\left|{\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}} -{\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\right)\mathrm{^{\prime}}\left[\widehat{Var}\left({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}} \right)-\widehat{Var}\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\right)\right]}^{-}\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}} -{\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\right)\right|,$$
(4)

where \({\left[.\right]}^{-}\) is the generalized inverse operator, see Holly (1982). Furthermore, \(\left|.\right|\) is the absolute value operator. Then, under the null of no OVB, i.e. under \({H}_{\mathrm{0,1}}\) or \({H}_{\mathrm{0,2}}\), T is asymptotically \({\chi }_{k+1}^{2}\) distributed:

$${H}_{0}: T\stackrel{d}{\to }{\chi }_{k+1}^{2}.$$
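A minimal numerical sketch of the statistic of Eq. (4) is given below. It assumes that the Moore-Penrose pseudoinverse (`numpy.linalg.pinv`) is an acceptable choice of generalized inverse, and the function name and the input values are purely illustrative.

```python
import numpy as np

def modified_hausman(beta_kf, beta_ols, var_kf, var_ols):
    """Statistic T of Eq. (4) together with its degrees of freedom."""
    d = np.asarray(beta_kf, dtype=float) - np.asarray(beta_ols, dtype=float)
    v_diff = np.asarray(var_ols, dtype=float) - np.asarray(var_kf, dtype=float)
    # Moore-Penrose pseudoinverse as one possible generalized inverse.
    t_stat = abs(float(d @ np.linalg.pinv(v_diff) @ d))
    dof = d.size  # k + 1 when the constant is included, as in Theorem 4
    return t_stat, dof

# Illustrative inputs: constant plus one slope (k + 1 = 2).
t_stat, dof = modified_hausman(
    beta_kf=[1.0, 2.0], beta_ols=[1.5, 2.0],
    var_kf=np.diag([0.25, 0.3]), var_ols=np.diag([0.5, 0.3]))
# The 5% critical value of a chi-squared with 2 degrees of freedom is 5.991.
print(t_stat, dof, t_stat > 5.991)
```

Because of the absolute value, swapping the two covariance matrices (the ordering expected under \({H}_{\mathrm{0,1}}\) rather than \({H}_{\mathrm{0,2}}\)) leaves the statistic unchanged, so the same rejection rule applies under either null.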

Corollary 1

Under some regularity conditions established in Hausman (1978), under the (unique) alternative hypothesis, the statistic of Eq. (4) is distributed as a non-central \({\chi }_{k+1}^{2}\) with non-centrality parameter as a function of the asymptotic bias, \(plim\left({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}-{\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\right)\).

Thus, the power of the test depends on the magnitude of the bias: the larger the bias, the larger the probability of rejecting the null (when it is false).

Corollary 2

Suppose that the null hypothesis at the basis of the Hausman test of Theorem 4 can be specified as \({{H}_{0}^{*}: ({{\varvec{P}}}^{\mathrm{^{\prime}}}{\varvec{P}})}^{-1}{{\varvec{P}}}^{\mathbf{^{\prime}}}{\varvec{v}}{\sqrt{N}\beta }_{1, N}=0\); where \({\beta }_{1, N}={\beta }_{1}+\frac{1}{\sqrt{N}}\theta \equiv \frac{1}{\sqrt{N}}\theta \), \({\beta }_{1}\) is defined in Eq. (1b) and \({\beta }_{1}=0\) is the restriction specified in \({H}_{\mathrm{0,1}}\), with \(\theta \) as a conformable vector of constants; thus \({\beta }_{1, N}\) defines the parameter(s) of the latent variable(s) under the local alternative. Moreover, \({\varvec{v}}\) is an (N × 1) vector: \({\varvec{v}}\equiv ({v}_{1} {v}_{2}\dots {v}_{N})\mathrm{^{\prime}}\) and \({\varvec{P}}\) is an (N × k) matrix: \({\varvec{P}}\equiv ({{\varvec{p}}}_{1}{{\varvec{p}}}_{2}\dots {{\varvec{p}}}_{{\varvec{N}}})\mathbf{^{\prime}}\). Then, (a) the null hypothesis \({{H}_{0}^{*}: ({{\varvec{P}}}^{\mathrm{^{\prime}}}{\varvec{P}})}^{-1}{{\varvec{P}}}^{\mathbf{^{\prime}}}{\varvec{v}}{\sqrt{N}\beta }_{1, N}=0\) considers all causes of the absence of the OVB as assumed in Theorem 4; (b) this Hausman test is asymptotically equivalent to the classical maximum-likelihood-based tests (that is the asymptotically equivalent Wald, Lagrange-Multiplier and the Likelihood Ratio tests).

Corollary 3

For N sufficiently large and as long as the non-centrality parameter of the chi-squared distribution, as defined in Corollary 1, is not too close to zero, the power of the test of Theorem 4 is close to one.

Note that if the test statistic T leads to not rejecting the null hypothesis, then the researcher should conclude that both \({\widehat{{\varvec{\beta}}}}_{{\varvec{O}}{\varvec{L}}{\varvec{S}}}\) and \({\widehat{{\varvec{\beta}}}}_{{\varvec{K}}{\varvec{F}}}\) are consistent. Instead of the efficiency properties (which, in principle might help to choose between them), the criterion of the ease of computation can be used, which always leads to choosing the OLS estimates. Simulations about the performance of the test are provided in Annex B.

4 Extending the method to a multivariate context

When the true DGP is represented by a (structural) VAR, the models of Sect. 2 require some modification. But first, the following remarks are necessary.

Remark 1

With Eqs. (1b) and (3), and under both DGPs [Eqs. (2a) and (2b)], the researcher immediately has a valid expression in order to both filter the latent variables and estimate the relevant parameters. In a VAR context, the test must be applied to the estimates of structural parameters only (to verify the correctness of the IRFs). Note that, in this context, for ease of exposition, we exclude the case of no omitted variable, as in Eq. (1a).

Remark 2

The usual issues related to a VAR context may arise, such as the ordering of the VAR and the condition under which exact identification is possible. In fact, in general terms, the analysis should be performed given an identification scheme. Fortunately, the specification of the null hypothesis (on which the proposed test is based) places several restrictions on the ordering of the VAR (based on ML-KF estimates of the unobservable variables); further details are provided below. This section now clarifies how to extend the general method (filtering and testing) to a multivariate context through the following steps:

  A) specification of the structural-form and reduced-form VARs under both the hypotheses of some and no statistical relationship between the observable and unobservable variables;

  B) filtering of the unobservable variables through the reduced-form VAR (based on ML-KF estimates of the unobservable variables), estimation of the relevant parameters;

  C) computation of the test on the estimated structural parameters.

4.1 Specifying the structural VARs

We now generalize the DGP described by Eqs. (1b) and (3). For this purpose, define \({{\varvec{x}}}_{{\varvec{t}}}\) as a (n × 1)-vector of observable variables and define \({{\varvec{y}}}_{{\varvec{t}}}\) as a ([n + m] × 1)-vector including both \({{\varvec{x}}}_{{\varvec{t}}}\) and the (m × 1)-vector of unobservable variables, \({{\varvec{\nu}}}_{{\varvec{t}}}\): \({{\varvec{y}}}_{{\varvec{t}}}=[{{\varvec{\nu}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}\boldsymbol{ }\boldsymbol{ }{{\varvec{x}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}]\boldsymbol{^{\prime}}.\) For the time being, \({{\varvec{y}}}_{{\varvec{t}}}\) is not meant to have a particular ordering. Consider the related structural-form VAR(r):

$${{\varvec{B}}}_{0}{{\varvec{y}}}_{{\varvec{t}}}={\varvec{k}}+{{\varvec{B}}}_{1}{{\varvec{y}}}_{{\varvec{t}}-1}+\dots +{{\varvec{B}}}_{{\varvec{r}}}{{\varvec{y}}}_{{\varvec{t}}-{\varvec{r}}}+{{\varvec{u}}}_{{\varvec{t}}}.$$
(5)

Here, \({\varvec{k}}\) is the ((m + n) × 1)-vector of constants and the ((m + n) × 1)-vector \({{\varvec{u}}}_{{\varvec{t}}}\) represents the structural disturbances with covariance matrix \(E[{{\varvec{u}}}_{{\varvec{t}}}{{\varvec{u}}}_{{\varvec{t}}}^{\prime}]\). Furthermore, it is convenient to generalize the process of Eq. (3) as an autonomous structural VAR(r):

$${{\varvec{H}}}_{0}{{\varvec{\upnu}}}_{\mathbf{t}} ={\varvec{w}}+{{\varvec{H}}}_{1}{{\varvec{\upnu}}}_{\mathbf{t}-1}+\dots +{{\varvec{H}}}_{{\varvec{r}}}{{\varvec{\upnu}}}_{\mathbf{t}-\mathbf{r}}+{{\varvec{u}}}_{{\varvec{t}}}^{{\varvec{\nu}}}.$$
(6)

\({\varvec{w}}\) is the (m × 1)-vector of (unidentified) constants and \({{\varvec{u}}}_{{\varvec{t}}}^{{\varvec{\nu}}}\) is the (m × 1)-vector of structural disturbances. We now generalize the underspecified model of Eq. (1a) from which OLS estimates are obtained. The corresponding structural VAR(r) is:

$${{\varvec{A}}}_{0}{{\varvec{x}}}_{{\varvec{t}}}={\varvec{c}}+{{\varvec{A}}}_{1}{{\varvec{x}}}_{{\varvec{t}}-1}+\dots +{{\varvec{A}}}_{{\varvec{r}}}{{\varvec{x}}}_{{\varvec{t}}-{\varvec{r}}}+{{\varvec{\epsilon}}}_{{\varvec{t}}}.$$
(7)

\({{\varvec{\epsilon}}}_{{\varvec{t}}}\) is an (n × 1)-vector of structural disturbances. Since \({{\varvec{B}}}_{0}\) is now a ([m + n] × [m + n])-matrix, we assume that there are sufficient theoretical restrictions such that \({{\varvec{B}}}_{0}\) is a lower triangular matrix with unit coefficients along the principal diagonal. This corresponds to a recursive structure where the first (m + n − 1) variables of \({{\varvec{y}}}_{{\varvec{t}}}\) respond with a lag to changes in the last variable. Coupled with the assumption that \(E[{{\varvec{u}}}_{{\varvec{t}}}{{\varvec{u}}}_{{\varvec{t}}}^{\prime}]\) is diagonal, the necessary conditions for the exact identification are verified (this is the order condition; the rank condition is verified empirically). Thus, assuming this ordering: \({{\varvec{y}}}_{{\varvec{t}}}=[{{\varvec{\nu}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}\boldsymbol{ }\boldsymbol{ }{{\varvec{x}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}]\boldsymbol{^{\prime}}\), the first m rows represent the m relations pertaining to vector \({{\varvec{\nu}}}_{{\varvec{t}}}\) and the last n rows represent the n relations arising between the variables of \({{\varvec{y}}}_{{\varvec{t}}}\). All these conditions imply that \({{\varvec{B}}}_{0}\) is specified as follows:

$${{\varvec{B}}}_{0}=\left[\begin{array}{cc}{{\varvec{H}}}_{0}& 0\\ {{\varvec{U}}}_{0} & {{\varvec{A}}}_{{\varvec{K}}}\end{array}\right],$$
(8)

where \({{\varvec{A}}}_{{\varvec{K}}}\) is an (n × n) lower triangular matrix of structural parameters, \({{\varvec{H}}}_{0}\) is an (m × m) lower triangular matrix, both with ones on the main diagonal, \({{\varvec{U}}}_{0}\) is an (n × m) matrix of structural (unidentified) parameters and 0 is an (m × n) matrix of zeros.
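The block structure of Eq. (8) can be checked numerically. The sketch below assembles \({{\varvec{B}}}_{0}\) from illustrative (assumed) blocks with m = 1 and n = 3, and compares its inverse with the standard formula for a lower block-triangular matrix; the helper name `build_b0` and all numerical values are hypothetical.

```python
import numpy as np

def build_b0(h0, u0, a_k):
    """Assemble B0 = [[H0, 0], [U0, A_K]] as in Eq. (8)."""
    m, n = h0.shape[0], a_k.shape[0]
    return np.block([[h0, np.zeros((m, n))],
                     [u0, a_k]])

# Illustrative values only: m = 1 latent and n = 3 observed variables.
H0 = np.eye(1)
A_K = np.array([[1.0, 0.0, 0.0],
                [0.4, 1.0, 0.0],
                [-0.2, 0.3, 1.0]])
U0 = np.array([[0.5], [-0.1], [0.7]])
B0 = build_b0(H0, U0, A_K)

# Inverse of a lower block-triangular matrix, block by block.
H0_inv = np.linalg.inv(H0)
A_K_inv = np.linalg.inv(A_K)
B0_inv_blocks = np.block([[H0_inv, np.zeros((1, 3))],
                          [-A_K_inv @ U0 @ H0_inv, A_K_inv]])
print(np.allclose(np.linalg.inv(B0), B0_inv_blocks))
```

The lower-right block of the inverse equals the inverse of \({{\varvec{A}}}_{{\varvec{K}}}\) whatever the value of \({{\varvec{U}}}_{0}\), while \({{\varvec{U}}}_{0}\) only affects how the disturbances of the latent block load on the observable block.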

4.2 The specification of the null hypothesis and the ordering of the VAR

The null hypothesis of no OVB places further restrictions on the above VAR; in fact, now specifying \({{\varvec{x}}}_{{\varvec{t}}}\) as \({{\varvec{x}}}_{{\varvec{t}}}:[{{\varvec{p}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}\boldsymbol{ }{{\varvec{q}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}]\boldsymbol{^{\prime}}\) where \({{\varvec{q}}}_{{\varvec{t}}}\) is a (h × 1) vector, \({{\varvec{p}}}_{{\varvec{t}}}\) is an ([n − h] × 1) vector; then:

  1) \(Cov[{p}_{it},{\nu }_{jt}]=0\), for all i and j, i = 1,…,(n − h); j = 1,…,m;

  2) \({{\varvec{y}}}_{{\varvec{t}}}=[{{\varvec{\nu}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}\boldsymbol{ }\boldsymbol{ }{{\varvec{x}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}]\boldsymbol{^{\prime}}\) and \({{\varvec{x}}}_{{\varvec{t}}}:{\left[{{\varvec{p}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}\boldsymbol{ }{{\varvec{q}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}\right]}^{\boldsymbol{^{\prime}}}={\left[{{\varvec{p}}}_{{\varvec{t}}}^{\boldsymbol{^{\prime}}}\boldsymbol{ }{q}_{t}\right]}^{\boldsymbol{^{\prime}}}\), that is \({{\varvec{q}}}_{{\varvec{t}}}\) is a single variable (h = 1) positioned at the bottom of the vector \({{\varvec{y}}}_{{\varvec{t}}}\); this ordering of \({{\varvec{y}}}_{{\varvec{t}}}\) is meant in a strict sense (note that in both hypotheses, \(Cov[{q}_{t},{\nu }_{jt}]\ne 0\) for all j, j = 1,…,m).

Thus, the correlation between some variables of \({{\varvec{p}}}_{{\varvec{t}}}\) and \({{\varvec{\nu}}}_{{\varvec{t}}}\) is excluded by the usual condition (1); in fact this implies that the parameters of the first (n − 1) rows of \({{\varvec{U}}}_{0}\) of Eq. (8) are 0 and the parameters of the nth row are different from 0.

The specification of \({{\varvec{y}}}_{{\varvec{t}}}\) in condition (2) and the fact that \({{\varvec{B}}}_{0}\) is assumed to be a lower triangular matrix ensure that only one variable, \({q}_{t}\), is (potentially) correlated with all remaining variables (\({{\varvec{p}}}_{{\varvec{t}}}\) and \({{\varvec{\nu}}}_{{\varvec{t}}}\)). Thus, fortunately, the null hypothesis places (several) restrictions on the ordering of \({{\varvec{y}}}_{{\varvec{t}}}\) [condition (2)] and hence mitigates the usual problem of exogeneity restrictions required in the VAR analysis. Note now that under the null hypothesis \({{\varvec{A}}}_{{\varvec{K}}}={{\varvec{A}}}_{0}\) holds, and hence the corresponding (consistent) estimates must converge to the same limit. If the test leads to rejecting the null (and hence there is no indication at all about the ordering of the VAR), the researcher may also find another ordering to carry the analysis forward.

4.3 The estimation of reduced-form VAR with filtered variables and of the structural parameters

The following statements summarize the estimation procedure and the construction of the proposed test. Details are provided in Annex A.3. A two-step estimation approach that handles both observable and unobservable variables is proposed. It is based on a state-space specification, which corresponds to the reduced-form VAR (based on ML-KF estimates of the unobservable variables). This makes it possible to obtain the filtered variables \(\widehat{{\varvec{\nu}}}\) and to estimate the relevant parameters.

Statement 1 Define as \(\boldsymbol{\Omega }\) the covariance matrix of the error terms in the reduced-form KFVAR. Computing \(\widehat{\boldsymbol{\Omega }}\) and its Cholesky decomposition yields the estimate of the matrix of structural parameters \({\widehat{{\varvec{B}}}}_{0}\) [consistently with Eq. (8)], from which \({\widehat{{\varvec{A}}}}_{{\varvec{K}}}\) is obtained. It holds:

$$\widehat{\boldsymbol{\Omega }}=\left[\begin{array}{cc}{\widehat{{\varvec{Q}}}}_{{\varvec{K}}{\varvec{F}}}& {\widehat{{\varvec{Q}}}}_{{\varvec{K}}{\varvec{F}}}{{\varvec{N}}}_{0}^{\mathrm{^{\prime}}}\\ {{{\varvec{N}}}_{0}\widehat{{\varvec{Q}}}}_{{\varvec{K}}{\varvec{F}}}& {\widehat{{\varvec{R}}}}_{{\varvec{K}}{\varvec{F}}}+{{\varvec{N}}}_{0}{\widehat{{\varvec{Q}}}}_{{\varvec{K}}{\varvec{F}}}{{\varvec{N}}}_{0}^{\mathrm{^{\prime}}}\end{array}\right],$$
(9)

where the details of \(\boldsymbol{\Omega }\) are provided in (A.7), \({\widehat{{\varvec{R}}}}_{{\varvec{K}}{\varvec{F}}}\), \({\widehat{{\varvec{Q}}}}_{{\varvec{K}}{\varvec{F}}}\) are the estimated covariance matrices of the state-space model of Eqs. (A.2) and (A.4), and \({{\varvec{N}}}_{0}\) is the matrix of the unidentified coefficients of \({{\varvec{\nu}}}_{{\varvec{t}}}\) in the observation equation (A.2). Thus, from the Cholesky decomposition of \(\widehat{\boldsymbol{\Omega }}\), the covariance matrix of the parameters of \({\widehat{{\varvec{A}}}}_{{\varvec{K}}}\) is derived.
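The block structure of Eq. (9) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the dimensions (two filtered and two observed variables), the numerical values of the matrices, and the function name `assemble_omega` are assumptions made purely for illustration. The mapping from the Cholesky factor to \({\widehat{{\varvec{B}}}}_{0}\) depends on Eq. (8) and is deliberately not reproduced here.

```python
import numpy as np


def assemble_omega(q_kf, r_kf, n0):
    """Assemble the block covariance matrix of Eq. (9) from its components."""
    q_kf = np.asarray(q_kf, dtype=float)
    r_kf = np.asarray(r_kf, dtype=float)
    n0 = np.asarray(n0, dtype=float)
    top = np.hstack([q_kf, q_kf @ n0.T])
    bottom = np.hstack([n0 @ q_kf, r_kf + n0 @ q_kf @ n0.T])
    return np.vstack([top, bottom])


# Hypothetical inputs (illustration only, not estimates from the article).
q_kf = np.array([[0.50, 0.10], [0.10, 0.40]])  # state-equation covariance
r_kf = np.array([[0.30, 0.05], [0.05, 0.20]])  # observation-equation covariance
n0 = np.array([[0.8, 0.0], [0.3, 0.6]])        # loadings of the filtered variables

omega_hat = assemble_omega(q_kf, r_kf, n0)

# Lower-triangular Cholesky factor: chol @ chol.T reproduces omega_hat.
chol = np.linalg.cholesky(omega_hat)
print(np.round(chol, 3))
```

Because the Schur complement of the upper-left block equals \({\widehat{{\varvec{R}}}}_{{\varvec{K}}{\varvec{F}}}\), the assembled matrix is positive definite whenever both estimated covariance matrices are, so the Cholesky factorization exists.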

Statement 2 Define as \({\boldsymbol{\Omega }}_{0}\) the covariance matrix of the reduced-form VAR of \({{\varvec{x}}}_{{\varvec{t}}}\) [corresponding to the structural-form VAR of Eq. (7)]; then from the Cholesky decomposition of \({\widehat{\boldsymbol{\Omega }}}_{0}\), \({\widehat{{\varvec{A}}}}_{0}\) is obtained.

Statement 3 Define \({{\varvec{\theta}}}_{0}\) and \({{\varvec{\theta}}}_{{\varvec{K}}}\) as the ([n − 1]n/2 × 1)-vectors collecting the parameters to be estimated of the matrices \({{\varvec{A}}}_{0}\) and \({{\varvec{A}}}_{{\varvec{K}}}\), respectively. Under the null of no OVB, the asymptotic distributions of \({\widehat{{\varvec{\theta}}}}_{0}\) and \({\widehat{{\varvec{\theta}}}}_{{\varvec{K}}}\) are:

$$\sqrt{N}\left({\widehat{{\varvec{\theta}}}}_{0 }-{{\varvec{\theta}}}_{0 }\right)\to N\left(0,N\bullet Var\left({\widehat{{\varvec{\theta}}}}_{0 }\right)\right);$$
(10)
$$\sqrt{N}\left({\widehat{{\varvec{\theta}}}}_{{\varvec{K}} }-{{\varvec{\theta}}}_{{\varvec{K}} }\right)\to N\left(0,N\bullet Var\left({\widehat{{\varvec{\theta}}}}_{{\varvec{K}} }\right)\right).$$

Under \({H}_{\mathrm{0,1}}\), \(N\bullet Var\left({\widehat{{\varvec{\theta}}}}_{0}\right)\) is defined in Eq. (A.9) and \(N\bullet Var\left({\widehat{{\varvec{\theta}}}}_{\mathbf{K}}\right)\) is defined similarly. As for \({H}_{\mathrm{0,2}}\), details are provided in Annex A.3, details of Statement 3.

Thus, the test aims at verifying whether \(\left({\widehat{{\varvec{\theta}}}}_{0\boldsymbol{ }}-{\widehat{{\varvec{\theta}}}}_{{\varvec{K}}}\right)\) is statistically significant.

Statement 4 In line with Theorem 4, the (modified Hausman) test is defined as:

$$T=\left|{\left({\widehat{{\varvec{\theta}}}}_{0 }-{\widehat{{\varvec{\theta}}}}_{{\varvec{K}} }\right)}^{\prime}{\left[\widehat{Var}\left({\widehat{{\varvec{\theta}}}}_{0 }\right)-\widehat{Var}\left({\widehat{{\varvec{\theta}}}}_{{\varvec{K}} }\right)\right]}^{-}\left({\widehat{{\varvec{\theta}}}}_{0 }-{\widehat{{\varvec{\theta}}}}_{{\varvec{K}} }\right)\right|.$$
(11)

Note that the number of degrees of freedom of the relevant \({\chi }^{2}\) distribution (according to which the test statistic is distributed) equals the number of parameters embedded in \({{\varvec{A}}}_{0}\) and \({{\varvec{A}}}_{{\varvec{K}}}\). As these are triangular matrices, this number is n(n − 1)/2.
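The quadratic form of Eq. (11) and its degrees of freedom can be sketched as follows. This is only an illustration under stated assumptions: the generalized inverse is taken to be the Moore-Penrose pseudo-inverse (one possible choice), the function names `ovb_test` and `degrees_of_freedom` are hypothetical, and the parameter vectors and covariance matrices are made-up numbers rather than estimates from the article.

```python
import numpy as np


def ovb_test(theta_0, theta_k, var_0, var_k):
    """Absolute value of the quadratic form in Eq. (11).

    Assumption: the generalized inverse is the Moore-Penrose pseudo-inverse.
    """
    diff = np.asarray(theta_0, dtype=float) - np.asarray(theta_k, dtype=float)
    var_diff = np.asarray(var_0, dtype=float) - np.asarray(var_k, dtype=float)
    return float(abs(diff @ np.linalg.pinv(var_diff) @ diff))


def degrees_of_freedom(n):
    """Number of free parameters in a triangular n x n matrix: n(n - 1)/2."""
    return n * (n - 1) // 2


# Hypothetical estimates for a VAR with n = 3 observed variables.
theta_0 = [0.50, 0.20, -0.10]
theta_k = [0.40, 0.25, -0.30]
var_0 = np.diag([0.02, 0.03, 0.05])
var_k = np.diag([0.01, 0.01, 0.01])

stat = ovb_test(theta_0, theta_k, var_0, var_k)
df = degrees_of_freedom(3)
print(stat, df)
```

The statistic would then be compared with the chosen quantile of the \({\chi }^{2}\) distribution with `df` degrees of freedom.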

Annex B.2 investigates the small sample properties through simulations of the proposed test when the VAR with filtered variables is estimated (in line with the null hypothesis). Further details about the computation of the test are provided in Annex C.

5 Two illustrative examples

This section provides two examples in which the test can be applied. The first example deals with a univariate regression; afterward the VAR context is considered.

5.1 Estimating the Phillips curve

In a seminal paper, Galì and Gertler (1999) aim at estimating the rational expectations-augmented Phillips curve (PC) for US data over the period 1960q1–1997q4. It takes the following form:

$${\pi }_{t}=\alpha {E}_{t}\left[{\pi }_{t+1}\right]+\beta {mc}_{t}+\gamma {\pi }_{t-1},$$
(12)

where \({\pi }_{t}\) is the inflation rate in Period t and \({E}_{t}\left[{\pi }_{t+1}\right]\) is the expectation of the inflation rate in Period t + 1 that agents form in Period t, based on all publicly available information. The “real economy variable” (\({mc}_{t}\)) is the (log) labor income share in the non-farm business sector; this represents a measure of the real marginal costs (MC) in the economy. The inflation rate is measured by the percent change in the GDP non-farm deflator. The expectations variable is unobservable. Galì and Gertler (1999) addressed this problem by assuming that the market forms expectations about the inflation rate in a rational manner. Thus, a convenient way to estimate the PC is the Generalized Method of Moments (GMM), see Hamilton (1994, Chap. 14).Footnote 4 This procedure is based on the orthogonality conditions between the error term of the PC regression and some instrumental variables, which are a direct consequence of assuming rational expectations. As rational agents exploit all relevant information when forming expectations, the error term associated with these expectations must be orthogonal to (uncorrelated with) any other variable evaluated at Period t. To this end, Galì and Gertler (1999) used several variables as instruments, such as four lags of inflation, the labor income share, the output gap, the long-short interest rate spread, wage inflation, and commodity price inflation.

They defined \({mc}_{t}\) as the “forcing variable” since it represents economic activity, and the related parameter \(\beta \) gauges how much economic activity affects the inflation rate. This quantity is relevant for monetary policy, as policymakers may affect aggregate activity in order to steer the inflation rate. Thus, the correct estimate of \(\beta \) is of primary importance. For the considered sample, Galì and Gertler (1999) obtained estimates for \(\beta \) ranging between 0.015 and 0.047; their baseline model yields \({\widehat{\beta }}_{GMM}=0.037\) (0.007).Footnote 5 Furthermore, the authors state that the data support the rational expectations-augmented PC, as the estimate of \(\alpha \) is significant at any conventional level.

At this juncture, we highlight some problems related to these conclusions. One shortcoming of the GMM-based approach concerns the validity of the instruments: what the econometrician may observe and hence use as an instrument is, in general terms, only a subset of the information used by the economic agents. Furthermore, the rationality assumption underlying the orthogonality conditions may not be valid, see for example Coibion et al. (2018).

Suppose now that the main objective of the analysis is to consistently estimate \(\beta \), the parameter related to the MC. Again, this is of primary interest to policymakers.

Our approach firstly consists of estimating the PC based on Kalman filtering any potential omitted variable, including the (rational) expectation component. Secondly, we perform the proposed test to verify which estimation of the PC provides unbiased estimates. Thus, we compare the estimates of the KF-, GMM-, and OLS-based PCs.

Estimating the PC of Eq. (12) by means of the KF has several advantages.

First, we do not have to take a stand about whether expectations are relevant and how they are formed: we simply filter any potential omitted variables.

Second, we do not have to specify the potential omitted variables in an exact manner: all omitted variables are merged into the state variable. Note that, as pointed out by Galì and Gertler (1999), it is plausible that the log of the labor income share, which is the MC measure in the PC regression, is only a proxy of the real marginal costs, that is, of the true value affecting the inflation rate as postulated by the theory. Such measurement error may be regarded as an omitted variable, and hence neglecting it may cause a bias.

Third, we do not have to find the relevant instruments: our data set includes only the inflation and the MC measures. Thus, the KF-based procedure is particularly convenient when expectations are rational, as they may be based on a wider information set.Footnote 6 In order to find a suitable state-space model, we first consider the more general specification of the observation and state equations:

$${\pi }_{t}={\upsilon }_{t}+\beta {mc}_{t}+{\gamma }_{1}{\pi }_{t-1}+{\gamma }_{2}{\pi }_{t-2}+{\varepsilon }_{t},$$
(13)
$${\upsilon }_{t}=c+\delta {\upsilon }_{t-1}+{\rho }_{0}{mc}_{t}+{\rho }_{1}{mc}_{t-1}+{\rho }_{2}{mc}_{t-2}{+\epsilon }_{t},$$
(14)

where \({\upsilon }_{t}\) is the state variable, and \({\varepsilon }_{t}\) and \({\epsilon }_{t}\) are error terms with zero mean and constant variance, uncorrelated with any other quantity of the system. The correct specification requires that \({\gamma }_{2}={\rho }_{2}={\rho }_{0}=0\). This specification has been obtained by appropriately selecting regressors in both equations by means of t-tests and maximum-likelihood-based criteria.Footnote 7 The estimated model is:Footnote 8

$${\widehat{\pi }}_{t}={\widehat{\upsilon }}_{t}+{0.047}^{**}{mc}_{t}+{0.94}^{***}{\pi }_{t-1},$$
(15)
$${\widehat{\upsilon }}_{t}=-{0.26}^{**}{\widehat{\upsilon }}_{t-1}-{0.06}^{**}{mc}_{t-1}.$$
(16)
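To make the filtering step concrete, the restricted state-space model (Eqs. (13)–(14) with \({\gamma }_{2}={\rho }_{2}={\rho }_{0}=0\) and no constant, as in Eq. (16)) can be run through a scalar Kalman recursion. The following is a minimal sketch under loudly stated assumptions: the error variances `r` and `q`, the initial state mean and variance, and the short data series are hypothetical, since none of them is reported in the text; only the coefficient values are taken from Eqs. (15)–(16). In an ML-KF estimation the returned log-likelihood would be maximized over the parameters, which is not shown here.

```python
import math


def kalman_filter_pc(pi, mc, beta, gamma, delta, rho1, r, q, v0=0.0, p0=1.0):
    """Scalar Kalman filter for
        pi_t = v_t + beta * mc_t + gamma * pi_{t-1} + eps_t,   Var(eps_t) = r
        v_t  = delta * v_{t-1} + rho1 * mc_{t-1} + e_t,        Var(e_t)   = q
    Returns the filtered states and the Gaussian log-likelihood.
    """
    v, p = v0, p0
    filtered, loglik = [], 0.0
    for t in range(1, len(pi)):
        # Prediction step from the state equation.
        v_pred = delta * v + rho1 * mc[t - 1]
        p_pred = delta ** 2 * p + q
        # One-step-ahead forecast error from the observation equation.
        resid = pi[t] - (v_pred + beta * mc[t] + gamma * pi[t - 1])
        f = p_pred + r  # forecast-error variance
        # Update step.
        gain = p_pred / f
        v = v_pred + gain * resid
        p = (1.0 - gain) * p_pred
        filtered.append(v)
        loglik += -0.5 * (math.log(2.0 * math.pi * f) + resid ** 2 / f)
    return filtered, loglik


# Hypothetical data (illustration only, not the series used in the article).
pi = [2.0, 2.1, 2.3, 2.2, 2.4, 2.5]
mc = [0.5, 0.6, 0.4, 0.7, 0.5, 0.6]

# Coefficients from Eqs. (15)-(16); r and q are assumed values.
states, loglik = kalman_filter_pc(
    pi, mc, beta=0.047, gamma=0.94, delta=-0.26, rho1=-0.06, r=0.05, q=0.02
)
print([round(s, 4) for s in states], round(loglik, 4))
```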

Estimates first show that the KF is able to obtain an estimate of the parameter related to the MC, which is similar to that obtained by Galì and Gertler (1999). Thus, without using any further information other than the GDP deflator and the MC, we obtain: \({\widehat{\beta }}_{KF}=0.047\) (0.002). The OLS estimates are also based on Eq. (12) after having substituted the expectation component with a constant:

$${\pi }_{t}=c+\beta {mc}_{t}+\gamma {\pi }_{t-1}+{\varsigma }_{t}.$$
(17)

The OLS estimate for \(\beta \) is \({\widehat{\beta }}_{OLS}=0.012\) (0.012). This standard error is computed in line with the two possible null hypotheses as outlined in Sect. 3.1 and Annex A.1.Footnote 9 Now, three alternative estimates for \(\beta \) are available. We are now able to test whether the KF-based estimated PC outperforms the alternative estimations, which are the following.

(a) The OLS-based estimation, as obtained from Eq. (17). This test aims at verifying whether estimating the PC without any expectation component provides unbiased estimates.

(b) The GMM-based estimation of Galì and Gertler (1999). This test aims at verifying whether estimating the PC with an expectation component and with the instruments used by the authors provides unbiased estimates. Note that this is now the benchmark estimator instead of the OLS estimator.

Thus, we first formulate the null hypothesis of no omitted variables bias in \({\widehat{\beta }}_{OLS}\), in line with Sect. 3. Computed as in Eq. (4), the T statistic is 8.75. Since the relevant chi-square distribution has one degree of freedom, the 1% critical value is 6.63. The Hausman test strongly rejects the hypothesis that \({\widehat{\beta }}_{OLS}\) and \({\widehat{\beta }}_{KF}\) are statistically equal (i.e. \(plim\left({\widehat{\beta }}_{KF} -{\widehat{\beta }}_{OLS}\right)=0\)), and hence, the null hypothesis of no bias in \({\widehat{\beta }}_{OLS}\) is rejected at any conventional level. Thus, we draw the following conclusions. First, the expectation component is an important explanatory variable, and its omission does cause a bias. However, there is no guarantee that it is the only relevant omitted variable, as potential measurement errors may be present.

Secondly, we formulate the null hypothesis of no omitted variables bias in \({\widehat{\beta }}_{GMM}\), in line with Sect. 3. The T statistic, computed as in Eq. (4), is 2.22. Since the relevant chi-square distribution has one degree of freedom, the 10% critical value is 2.71. The Hausman test does not reject, at any conventional level, the hypothesis that \({\widehat{\beta }}_{GMM}\) and \({\widehat{\beta }}_{KF}\) are statistically equal, and hence, the null hypothesis of no bias in \({\widehat{\beta }}_{GMM}\) cannot be rejected. However, there are two reasons to suppose that \({\widehat{\beta }}_{KF}\) is more reliable than \({\widehat{\beta }}_{GMM}\).
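The two reported statistics can be checked with simple arithmetic. The sketch below assumes that Eq. (4), which is not reproduced in this section, is the scalar analogue of Eq. (11), and that the figures in parentheses next to each point estimate are standard errors; the function names are hypothetical. Under these assumptions the computation returns 8.75 for the OLS comparison and 2.22 for the GMM comparison, in line with the values reported above. The p-values use the fact that a chi-squared variable with one degree of freedom is a squared standard normal.

```python
import math


def scalar_hausman(b_kf, se_kf, b_alt, se_alt):
    """Scalar version of the test: squared difference over |variance difference|."""
    return abs((b_alt - b_kf) ** 2 / (se_alt ** 2 - se_kf ** 2))


def chi2_sf_1df(x):
    """P(X > x) for a chi-squared variable with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Point estimates and standard errors as reported in the text.
b_kf, se_kf = 0.047, 0.002
t_ols = scalar_hausman(b_kf, se_kf, b_alt=0.012, se_alt=0.012)
t_gmm = scalar_hausman(b_kf, se_kf, b_alt=0.037, se_alt=0.007)

print("OLS vs KF:", round(t_ols, 2), "p-value:", round(chi2_sf_1df(t_ols), 4))
print("GMM vs KF:", round(t_gmm, 2), "p-value:", round(chi2_sf_1df(t_gmm), 4))
```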

First, as our point estimate for \(\beta \) is higher than that obtained by Galì and Gertler (1999), there is a presumption that \({\widehat{\beta }}_{GMM}\) is biased downward. In fact, as the authors point out, the presence of omitted variables leads to underestimating \(\beta \); thus our KF-based estimate appears to be more reliable in this sense. The further bias present in \({\widehat{\beta }}_{GMM}\) but plausibly not in \({\widehat{\beta }}_{KF}\) may be due to the measurement errors in the MC measure, which again implies a negative bias, see Greene (2012, Chap. 4).

Second, the KF-based estimate of the PC requires less structure for the expectation formation. Even if we interpret the state variable as only an expectation component, we do not have to assume that this expectation is rational. Rational expectations require that all available information is exploited to predict the inflation rate. This does not seem to be the case when we consider the estimate of the state equation. In fact, the estimates in Eq. (16) show that the MC affects the state variable only with a lag, which is in contrast with the rational expectations hypothesis. To see this, recall that Galì and Gertler (1999) constructed the so-called “fundamental inflation rate” which, under rational expectations, equals the discounted expected flow of future MC. Given that the time series of MC is highly persistent,Footnote 10 rational agents must use the current value of MC as a predictor for the future development of the inflation rate. This, however, is not the case. Thus, inflation expectations are important in the specification of the PC, but there is no sufficient evidence that they are formed rationally, and this may have jeopardized the GMM estimates of Galì and Gertler (1999). Moreover, our conclusion is consistent with the stylized fact that disinflation is costly for the economy, as documented by the authors. Costly disinflation is at odds with the rational expectations assumption: it is consistent with adaptive expectations and/or with inflation inertia, see Carlin and Soskice (2006, Chap. 3).

In sum, the proposed test and the related procedure lead us to strongly reject the traditional Phillips curve (without the expectation component). However, they do not corroborate the hypothesis that expectations are formed (completely) rationally.

5.2 Omitted variables in the VAR analysis and the price puzzle

We consider the dataset of quarterly data from Giordani (2004), covering 1966q1 to 2001q3, and its extension. Quarterly data are more appropriate than monthly data in that they are less affected by measurement errors. In fact, Giordani (2004) showed that measurement errors are less severe with quarterly data, as they accumulate on a quarterly basis and tend to go to zero by the law of large numbers. With this result in hand, structural estimates are performed and the proposed test is applied. Firstly, x = [GDP INF FF]′ is considered (because the Federal Reserve exploits all available information that arises on a quarterly basis) and from this VAR, \({\widehat{{\varvec{A}}}}_{0}\) and hence \({\widehat{{\varvec{\theta}}}}_{0}\) are obtained. The ordering of the observed variables x in the KFVAR is assumed to be the same, and hence, y = [V1 V2 V3 GDP INF FF]′. Thus \({\widehat{{\varvec{B}}}}_{0}\) is obtained, which includes \({\widehat{{\varvec{A}}}}_{{\varvec{K}}}\) from which \({\widehat{{\varvec{\theta}}}}_{{\varvec{K}}}\) is derived. The proposed test yields T = 12.0; Annex C shows the details of this calculation. Under the null, the test statistic is chi-squared distributed with three degrees of freedom, whose 0.99-quantile is 11.3. Thus, the null hypothesis of no bias in the VAR with x = [GDP INF FF]′ is rejected at the usual significance levels. This test is obtained under \({\mathrm{H}}_{\mathrm{0,1}}\) because, according to the quoted literature, it is implausible to have (omitted) regressors that are uncorrelated with measures of output and prices and are nevertheless capable of affecting the measure of monetary policy. In any case, throughout the article all values of the test are also computed in line with \({\mathrm{H}}_{\mathrm{0,2}}\), and the conclusions are the same at the declared critical levels. To this purpose, a VAR-based generalization proposed in Newey and West (1987) is applied. Details are provided in Annex A.3.

Thus, the method outlined in this article has the following advantage. It robustly detects the presence of omitted variables in the sense that, in contrast to traditional tests, it does not depend on the availability of the omitted variable or of some proxy thereof.

With reference to the analysis of Castelnuovo and Surico (2010), the test and the related procedure have been applied to the periods before and after 1979q4 (in which the Fed chairman Volcker came into office). The test is applied to the (extended) Giordani data set (note that Castelnuovo and Surico 2010 used the GDP deflator as a measure of the inflation rate): on the sample 1955q2–1979q3, the test statistic is T = 23.5, and on the sample 1979q4–2015q2, it is T = 18.4. Thus, the null hypothesis of no omitted variables is rejected in both samples. At this juncture, note that Castelnuovo and Surico (2010) excluded the presence of omitted variable bias after 1979q4 by simply “looking” at the IRFs; we have instead detected it through a formal test. This was also the approach of Bernanke et al. (2005).
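The rejection decisions for the VAR statistics can be illustrated by computing approximate p-values. The sketch below assumes that all three statistics reported in this subsection (12.0, 23.5 and 18.4) refer to a three-variable VAR, so that the reference distribution has n(n − 1)/2 = 3 degrees of freedom; this is stated in the text only for T = 12.0 and is an assumption for the two subsample statistics. The closed-form survival function used here is specific to three degrees of freedom, and the sample labels are descriptive only.

```python
import math


def chi2_sf_3df(x):
    """P(X > x) for a chi-squared variable with three degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


n = 3                    # observed variables: GDP, INF, FF
df = n * (n - 1) // 2    # parameters in a triangular 3 x 3 matrix

reported = {
    "baseline VAR": 12.0,
    "1955q2-1979q3": 23.5,
    "1979q4-2015q2": 18.4,
}
p_values = {label: chi2_sf_3df(stat) for label, stat in reported.items()}

for label, p in p_values.items():
    print(label, "p-value:", format(p, ".2e"), "reject at 1%:", p < 0.01)
```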

6 Conclusion

Throughout this article, we have shown how to perform a quadratic-form-based test to verify the presence of omitted variables bias by comparing estimates obtained by a basic procedure (say OLS) with maximum-likelihood Kalman-filter-based (ML-KF) estimates. This test does not require knowledge of the omitted variable or of some proxy thereof. Formally, the test considers a linear model with some observable and unobservable variables. Under the null hypothesis, unobserved variables are correlated with the dependent (observed) variable only, and no unobserved variable is correlated with the observed exogenous variables. Under the alternative hypothesis, some unobserved variables are also correlated with at least one observed (exogenous) variable. Under the null, both OLS and ML-KF estimates are consistent (and hence their difference is not statistically different from zero). Due to its quadratic form, the derived test statistic is (asymptotically) chi-squared distributed. This univariate framework is extended to a VAR context. In this case, the test verifies whether the structural estimates of the OLS-based VAR and of the ML-KF-based VAR are statistically different, which detects the presence of the OVB. The empirical part contains two illustrative examples. First, the estimate of the rational expectations-augmented Phillips curve is considered. Through the proposed test and the related procedure, it has been shown that expectations are an important component for explaining current inflation. However, there is no evidence that they are formed rationally. The second example deals with the estimate of a VAR including output, the inflation rate and the federal funds rate (US data). It is shown that the testing procedure proposed in this article is able to robustly establish the presence of three omitted variables affecting the estimates of this VAR, causing the so-called “price puzzle”.