
A Monte Carlo Study of Time Varying Coefficient (TVC) Estimation

  • Stephen G. Hall
  • Heather D. Gibson
  • G. S. Tavlas
  • Mike G. Tsionas

Abstract

A number of recent papers have proposed a time-varying-coefficient (TVC) procedure that, in theory, yields consistent parameter estimates in the presence of measurement errors, omitted variables, incorrect functional forms, and simultaneity. The key element of the procedure is the selection of a set of driver variables. With an ideal driver set the procedure is both consistent and efficient. However, in practice it is not possible to know if a perfect driver set exists. We construct a number of Monte Carlo experiments to examine the performance of the methodology under (i) clearly-defined conditions and (ii) a range of model misspecifications. We also propose a new Bayesian search technique for the set of driver variables underlying the TVC methodology. Experiments are performed to allow for incorrectly specified functional form, omitted variables, measurement errors, unknown nonlinearity and endogeneity. In all cases except the last, the technique works well in reasonably small samples.

Keywords

Time-varying coefficients · Specification errors · Monte Carlo

JEL Classification

C130 · C190 · C220

1 Introduction

A series of papers have proposed the use of time-varying coefficient (TVC) models to uncover bias-free estimates of a set of model coefficients in the presence of omitted variables, measurement error and an unknown true functional form.1 There have also been a reasonably large number of successful applications of the technique.2 However, it is difficult to establish the usefulness of a technique strictly through applications, since we can never be certain of the accuracy of the results. This paper attempts to bridge the gap between the asymptotic results of the theoretical papers and the apparently good performance of the applied papers by constructing a set of Monte Carlo experiments to examine (1) how well the technique performs under clearly-defined conditions and (2) the limits on the technique’s ability to perform successfully under a broad range of model misspecifications.

The technique is motivated by an important theorem that was first proved by Swamy and Mehta (1975) and has recently been confirmed by Granger (2008), who quoted a proof that he attributed to Hal White. This theorem states that any nonlinear function may be exactly represented by a linear relationship with time-varying parameters. The importance of this theorem is that it allows us to capture an unknown true functional form in this framework. The parameters of this time-varying-coefficient model are, of course, not consistent estimates of the true functional form, since they will be contaminated by the usual biases due to omitted variables, measurement error and simultaneity. The technique being investigated here allows us, in principle, to decompose the TVCs into two components: we associate the first component with the true nonlinear structure, which we interpret as the derivative of the dependent variable with respect to each of the independent variables in the unknown, nonlinear, true function; we associate the second component with the biases emanating from misspecification, which we then remove from the TVCs to give our consistent estimates. Potentially, this technique offers an interesting way forward in dealing with model misspecification. It has generally been applied in a time series setting, but it can equally well be interpreted as a cross section3 or panel estimation technique.

The remainder of this paper is structured as follows. Section 2 outlines the basic (TVC) theoretical framework. Section 3 discusses some computational issues associated with estimating the model. Section 4 reports on a series of Monte Carlo experiments. Section 5 concludes. An “Appendix” provides details on the computational methods used in the Monte Carlo simulations.

2 The Theoretical Framework

We follow Swamy et al. (2010), who laid the groundwork for uncovering causal economic laws. We assume
$$ y_t^* = f\left( \mathbf{x}_t^*, \mathbf{e}_t^* \right), $$
(1)
where \( \mathbf{x}_t^*, \mathbf{e}_t^* \) are the true determinants of \( y_t^* \). Alternatively we can represent this relationship by
$$ y_t^* = \alpha_{0t} + \boldsymbol{\alpha}'_{1t} \mathbf{x}_t^* + \boldsymbol{\alpha}'_{2t} \mathbf{e}_t^* $$
(2)
We have the auxiliary equations:
$$ \mathbf{e}_t^* = \boldsymbol{\Psi}_t \mathbf{x}_t^* + \mathbf{v}_t^* $$
(3)
Substituting (3) into (2) gives:
$$ y_t^* = \left( \alpha_{0t} + \boldsymbol{\alpha}'_{2t} \mathbf{v}_t^* \right) + \left( \boldsymbol{\alpha}'_{1t} + \boldsymbol{\alpha}'_{2t} \boldsymbol{\Psi}_t \right) \mathbf{x}_t^* . $$
(4)
To deal with errors in variables we assume:
$$ y_t = y_t^* + v_t , $$
(5)
$$ \mathbf{x}_t = \mathbf{x}_t^* + \mathbf{w}_t . $$
(6)
Substituting into (4), we obtain:
$$ \begin{aligned} y_t & = \left( \alpha_{0t} + \boldsymbol{\alpha}'_{2t} \mathbf{v}_t^* + v_t \right) + \left( \boldsymbol{\alpha}'_{1t} + \boldsymbol{\alpha}'_{2t} \boldsymbol{\Psi}_t \right)\left( \mathbf{I} - \mathbf{D}_{wt} \mathbf{D}_{xt}^{-1} \right) \mathbf{x}_t \\ & = \beta_{0t} + \boldsymbol{\beta}'_{xt} \mathbf{x}_t \equiv \mathbf{x}'_{et} \boldsymbol{\beta}_t , \end{aligned} $$
(7)
where \( \beta_{0t} = \alpha_{0t} + \boldsymbol{\alpha}'_{2t} \mathbf{v}_t^* + v_t \), \( \boldsymbol{\beta}'_{xt} = \left( \boldsymbol{\alpha}'_{1t} + \boldsymbol{\alpha}'_{2t} \boldsymbol{\Psi}_t \right)\left( \mathbf{I} - \mathbf{D}_{wt} \mathbf{D}_{xt}^{-1} \right) \), \( \mathbf{x}_{et} = \left[ 1, \mathbf{x}'_t \right]' \), \( \boldsymbol{\beta}_t = \left[ \beta_{0t}, \boldsymbol{\beta}'_{xt} \right]' \), and \( \mathbf{D}_{wt}, \mathbf{D}_{xt} \) are diagonal matrices with \( \mathbf{w}_t \) and \( \mathbf{x}_t \), respectively, along the diagonal. Finally, we assume there exists a vector \( \mathbf{z}_t \) of drivers such that
$$ \boldsymbol{\beta}_t = \boldsymbol{\Pi} \mathbf{z}_{et} + \boldsymbol{\varepsilon}_t , $$
(8)
where \( \mathbf{z}_{et} = \left[ 1, \mathbf{z}'_t \right]' \). Under the assumption
$$ \boldsymbol{\varepsilon}_t \sim \mathrm{IN}\left( \mathbf{0}, \boldsymbol{\Sigma} \right), $$
(9)
it is straightforward to obtain the following:
$$ E\left( y_t | \mathbf{x}_t, \mathbf{z}_t \right) = \left( \mathbf{z}'_{et} \otimes \mathbf{x}'_{et} \right) \mathrm{vec}\left( \boldsymbol{\Pi} \right) $$
(10)
In matrix notation we have
$$ E\left( \mathbf{y} | \mathbf{X}_z \right) = \mathbf{X}_z \, \mathrm{vec}\left( \boldsymbol{\Pi} \right), \quad \mathrm{cov}\left( \mathbf{y} | \mathbf{X}_z \right) = \sigma_a^2 \boldsymbol{\Omega} $$
(11)
where \( \mathbf{X}_z = \left( \mathbf{z}_{e1} \otimes \mathbf{x}_{e1}, \ldots, \mathbf{z}_{eT} \otimes \mathbf{x}_{eT} \right)' \), \( \sigma_a^2 \boldsymbol{\Omega} = \mathbf{D}_x \left( \mathbf{I}_T \otimes \sigma_a^2 \boldsymbol{\Sigma} \right) \mathbf{D}'_x \), and \( \mathbf{D}_x = \mathrm{diag}\left[ \mathbf{x}'_{e1}, \ldots, \mathbf{x}'_{eT} \right] \). A restrictive version of (7) is
$$ y_t = \mathbf{x}'_{et} \boldsymbol{\beta}_t + u_t , \quad u_t \sim \mathrm{IN}\left( 0, \sigma_u^2 \right), $$
(12)
where \( \beta_{0t} \) is redefined as \( \alpha_{0t} + v_t \), \( \boldsymbol{\beta}_t \) is independent of \( \boldsymbol{\alpha}'_{2t} \mathbf{v}_t^* = u_t \), and (8) becomes
$$ \boldsymbol{\beta}_t = \boldsymbol{\Pi}_r \mathbf{z}_{et} + \boldsymbol{\varepsilon}_{rt} $$
(13)
where the first row of \( \boldsymbol{\Pi}_r \) post-multiplied by \( \mathbf{z}_{et} \) does not contain the mean of \( \boldsymbol{\alpha}'_{2t} \mathbf{v}_t^* \), and the first element of \( \boldsymbol{\varepsilon}_{rt} \) is independent of \( \boldsymbol{\alpha}'_{2t} \mathbf{v}_t^* \). The error vectors \( \boldsymbol{\varepsilon}_t \) and \( \boldsymbol{\varepsilon}_{rt} \) are those introduced in (8) and (13), respectively. Substituting, we have
$$ y_t = \mathbf{x}'_{et} \boldsymbol{\Pi}_r \mathbf{z}_{et} + u_t + \mathbf{x}'_{et} \boldsymbol{\varepsilon}_{rt} = \left( \mathbf{z}'_{et} \otimes \mathbf{x}'_{et} \right) \mathrm{vec}\left( \boldsymbol{\Pi}_r \right) + u_t + \mathbf{x}'_{et} \boldsymbol{\varepsilon}_{rt} . $$
(14)
Defining \( \mathbf{z}_{et} \otimes \mathbf{x}_{et} = \mathbf{X}_t \) and \( \boldsymbol{\pi}_r = \mathrm{vec}\left( \boldsymbol{\Pi}_r \right) \), we have:
$$ y_t = \mathbf{X}'_t \boldsymbol{\pi}_r + u_t + \mathbf{x}'_{et} \boldsymbol{\varepsilon}_{rt} . $$
(15)
By assumption, \( E\left( u_t + \mathbf{x}'_{et} \boldsymbol{\varepsilon}_{rt} \right) = 0 \) and \( \mathrm{var}\left( u_t + \mathbf{x}'_{et} \boldsymbol{\varepsilon}_{rt} \right) = \sigma_u^2 + \mathbf{x}'_{et} \sigma_r^2 \boldsymbol{\Sigma}_r \mathbf{x}_{et} \).
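To make the data-generating process concrete, the following is a minimal sketch (Python/NumPy; the function and variable names are ours, not the authors') of simulating data from the restricted model (12)–(13): the coefficients are driven by \( \mathbf{z}_{et} \) through \( \boldsymbol{\Pi}_r \), and \( y_t \) follows the TVC observation equation. It is illustrative only.

```python
import numpy as np

def simulate_tvc(T, Pi_r, Sigma_r, sigma_u, rng=None):
    """Generate (y_t, x_et, z_et, beta_t) from the TVC model (12)-(13).

    Pi_r    : (p, q) loading matrix of the coefficient drivers
    Sigma_r : (p, p) covariance of eps_rt
    """
    rng = rng or np.random.default_rng()
    p, q = Pi_r.shape
    x_e = np.column_stack([np.ones(T), rng.normal(size=(T, p - 1))])  # x_et = [1, x_t']'
    z_e = np.column_stack([np.ones(T), rng.normal(size=(T, q - 1))])  # z_et = [1, z_t']'
    eps = rng.multivariate_normal(np.zeros(p), Sigma_r, size=T)
    beta = z_e @ Pi_r.T + eps                      # beta_t = Pi_r z_et + eps_rt, eq. (13)
    y = np.sum(x_e * beta, axis=1) + sigma_u * rng.normal(size=T)     # eq. (12)
    return y, x_e, z_e, beta
```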

3 Computational Aspects

Under the assumption that both \( u_t \) and \( \boldsymbol{\varepsilon}_{rt} \) are normally distributed, the likelihood function is
$$ L(\theta; Y) \propto \prod\nolimits_{t=1}^{T} \left( \sigma_u^2 + \mathbf{x}'_{et} \boldsymbol{\Sigma}_r \mathbf{x}_{et} \right)^{-1/2} \exp\left\{ -\tfrac{1}{2} \sum\nolimits_{t=1}^{T} \frac{\left( y_t - \mathbf{X}'_t \boldsymbol{\pi}_r \right)^2}{\sigma_u^2 + \mathbf{x}'_{et} \boldsymbol{\Sigma}_r \mathbf{x}_{et}} \right\} $$
(16)
where the parameter vector is \( \theta = \left[ \boldsymbol{\pi}'_r, \sigma_r, \sigma_u, \mathrm{vech}(\boldsymbol{\Sigma}_r)' \right]' \). Coupled with a prior \( p(\theta) \), Bayes' theorem gives the posterior kernel:
$$ p(\theta | Y) \propto L(\theta; Y)\, p(\theta), \quad \theta \in \Theta $$
(17)
We assume a standard non-informative prior:
$$ p(\boldsymbol{\Sigma}_r) \propto |\boldsymbol{\Sigma}_r|^{-(d+1)/2} $$
where \( d \) is the dimension of \( \boldsymbol{\Sigma}_r \). The prior of \( \boldsymbol{\pi} \) will be detailed below.
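For concreteness, the following is a minimal sketch (Python/NumPy, our notation; the array layouts are assumptions, not the authors' code) of the log of the likelihood (16), where X stacks the rows \( \mathbf{X}'_t = (\mathbf{z}_{et} \otimes \mathbf{x}_{et})' \) and x_e stacks the rows \( \mathbf{x}'_{et} \).

```python
import numpy as np

def log_likelihood(pi_r, sigma_u2, Sigma_r, y, X, x_e):
    """Log of the likelihood in (16).

    y   : (T,)   observations
    X   : (T, K) rows X_t' = (z_et kron x_et)'
    x_e : (T, p) rows x_et' = [1, x_t']
    """
    # Observation-specific variances sigma_u^2 + x_et' Sigma_r x_et
    var_t = sigma_u2 + np.einsum('ti,ij,tj->t', x_e, Sigma_r, x_e)
    resid = y - X @ pi_r
    return -0.5 * np.sum(np.log(var_t)) - 0.5 * np.sum(resid ** 2 / var_t)
```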
Markov chain Monte Carlo (MCMC) techniques can be used to obtain a sample \( \{ \theta^{(s)}, s = 1, \ldots, S \} \) that converges in distribution to the posterior \( p(\theta | Y) \). One efficient MCMC strategy is the following.
  1. Draw \( \boldsymbol{\pi}_r \) from its conditional distribution:
$$ \boldsymbol{\pi}_r | \sigma_r, \sigma_u, \boldsymbol{\Sigma}_r, \mathbf{Y} \sim N\left( \hat{\boldsymbol{\pi}}_r, \mathbf{V} \right) $$
    (18)
    where \( \hat{\boldsymbol{\pi}}_r = \left( \mathbf{X}' \boldsymbol{\Omega}^{-1} \mathbf{X} \right)^{-1} \mathbf{X}' \boldsymbol{\Omega}^{-1} \mathbf{y} \), \( \mathbf{V} = \left( \mathbf{X}' \boldsymbol{\Omega}^{-1} \mathbf{X} \right)^{-1} \) and \( \boldsymbol{\Omega} = \mathrm{diag}\left( \sigma_u^2 + \mathbf{x}'_{et} \boldsymbol{\Sigma}_r \mathbf{x}_{et},\; t = 1, \ldots, T \right) \).

  2. Reparametrize \( \boldsymbol{\Sigma}_r \) as \( \boldsymbol{\Sigma}_r = \mathbf{C}'_r \mathbf{C}_r \), with \( \sigma_u \propto \exp(c_0) \) and \( \sigma_r \propto \exp(c_{00}) \). Denoting the distinct non-zero elements of \( \mathbf{C}_r \) by \( c_1, \ldots, c_p \), the new parameter vector is \( \boldsymbol{\pi}_r \) together with \( \mathbf{c} = [c_0, c_{00}, c_1, \ldots, c_p]' \in \mathbb{R}^{p+2} \). Draws from the conditional posterior distribution of \( \mathbf{c} | \boldsymbol{\pi}_r, \mathbf{Y} \) can be obtained using the Girolami and Calderhead (2011) Metropolis-adjusted Langevin diffusion method described in the “Appendix”.

If we define \( \hat{\boldsymbol{\beta}}_t = \left( \mathbf{x}_t \mathbf{x}'_t + \sigma^2 \boldsymbol{\Sigma}^{-1} \right)^{-1} \left( \mathbf{x}_t y_t + \sigma^2 \boldsymbol{\Sigma}^{-1} \boldsymbol{\Pi} \mathbf{z}_t \right) \) and \( \mathbf{V}_{\beta t} = \left( \mathbf{x}_t \mathbf{x}'_t + \sigma^2 \boldsymbol{\Sigma}^{-1} \right)^{-1} \), we obtain:
$$ \boldsymbol{\beta}_t | \sigma, \boldsymbol{\Sigma}, \mathbf{Y} \sim N\left( \hat{\boldsymbol{\beta}}_t, \mathbf{V}_{\beta t} \right) . $$
(19)

In this form we can avoid a possibly inefficient Gibbs sampler which relies on drawing \( \boldsymbol{\Pi} \) and \( \boldsymbol{\Sigma} \) from (13), \( \{ \boldsymbol{\beta}_t, t = 1, \ldots, T \} \) from (19) and \( \sigma \) from (12).
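Both conditional draws above are standard normal updates: (18) is a GLS-type draw because \( \boldsymbol{\Omega} \) is diagonal, and (19) is a one-period update for \( \boldsymbol{\beta}_t \). A minimal sketch of each (Python/NumPy; our notation and array layout, illustrative only, not the authors' code):

```python
import numpy as np

def draw_pi_r(y, X, x_e, sigma_u2, Sigma_r, rng):
    """Draw pi_r | sigma_u, Sigma_r, Y from N(pi_hat, V) as in (18).

    X : (T, K) rows X_t' = (z_et kron x_et)';  x_e : (T, p) rows x_et'.
    """
    omega = sigma_u2 + np.einsum('ti,ij,tj->t', x_e, Sigma_r, x_e)  # diagonal of Omega
    Xw = X / omega[:, None]                      # Omega^{-1} X (Omega is diagonal)
    V = np.linalg.inv(X.T @ Xw)                  # (X' Omega^{-1} X)^{-1}
    pi_hat = V @ (Xw.T @ y)                      # GLS mean
    return rng.multivariate_normal(pi_hat, V)

def draw_beta_t(x_t, y_t, sigma2, Sigma_inv, Pi_z_t, rng):
    """Draw beta_t | sigma, Sigma, Y from (19) for a single period t.

    x_t : (p,) regressor vector x_et;  Pi_z_t : (p,) prior mean Pi z_t from (13).
    """
    prec = np.outer(x_t, x_t) + sigma2 * Sigma_inv          # x_t x_t' + sigma^2 Sigma^{-1}
    V_bt = np.linalg.inv(prec)
    mean = V_bt @ (x_t * y_t + sigma2 * (Sigma_inv @ Pi_z_t))
    return rng.multivariate_normal(mean, V_bt)
```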

Selecting the drivers

Suppose we have (12) and instead of (13) we have
$$ \boldsymbol{\beta}_t = \boldsymbol{\Pi}_m \mathbf{z}_t^{(m)} + \boldsymbol{\varepsilon}_t^{(m)}, \quad m = 1, \ldots, M, $$
(20)
where \( \mathbf{z}_t^{(m)} \) is a potential set (subset) of drivers from a universe \( \mathcal{Z} = \left\{ \mathbf{z}_{t1}, \ldots, \mathbf{z}_{tG} \right\} \). Equations (12) and (20) define different models indexed by \( m \). As searching through all possible combinations of variables in \( \mathcal{Z} \) is infeasible, we follow the Stochastic Search Variable Selection (SSVS) approach of George et al. (2008).4 SSVS involves a prior of the form:
$$ \boldsymbol{\pi} | \boldsymbol{\delta} \sim N\left( \mathbf{0}, \mathbf{D} \right) $$
(21)
where \( \boldsymbol{\delta} \) is a vector of unknown parameters whose elements are \( \delta_j \in \{0, 1\} \), and \( \mathbf{D} = \mathrm{diag}\left[ d_1^2, \ldots, d_G^2 \right] \) with
$$ d_j^2 = \underline{\kappa}_{0j}^2 \;\; \text{if} \;\; \delta_j = 0, \quad \text{and} \quad d_j^2 = \underline{\kappa}_{1j}^2 \;\; \text{if} \;\; \delta_j = 1 . $$
(22)
The prior implies a mixture of two normals:
$$ \pi_j | \delta_j \sim (1 - \delta_j)\, N\left( 0, \underline{\kappa}_{0j}^2 \right) + \delta_j\, N\left( 0, \underline{\kappa}_{1j}^2 \right) $$
(23)
If \( \underline{\kappa}_{0j} \) is “small” and \( \underline{\kappa}_{1j} \) is “large”, then variable j is likely to be excluded from the model when \( \delta_j = 0 \) and likely to be included when \( \delta_j = 1 \). The prior for the indicator vector \( \boldsymbol{\delta} \) is:
$$ P(\delta_j = 1) = \underline{q}_j , \quad P(\delta_j = 0) = 1 - \underline{q}_j $$
(24)
and we set \( \underline{q}_j = \tfrac{1}{2} \). For \( \underline{\kappa}_{0j} \) and \( \underline{\kappa}_{1j} \), George et al. (2008) propose a semi-automatic procedure based on \( \underline{\kappa}_{0j}^2 = c_0 \hat{v}(\pi_j) \) and \( \underline{\kappa}_{1j}^2 = c_1 \hat{v}(\pi_j) \) with \( c_0 = \tfrac{1}{10} \) and \( c_1 = 10 \), where \( \hat{v}(\pi_j) \) is a preliminary estimate of the variance of \( \pi_j \).

For the elements of \( {\mathbf{c}} \) we follow a similar approach. If \( {\text{c}}_{{\text{j}}} \) corresponds to a diagonal element it is always included in the model. If not, we use a mixture-of-normals SSVS approach as above.
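The SSVS indicator draw implied by (23)–(24) is a simple Bernoulli update for each \( \delta_j \) given \( \pi_j \). The following is a minimal sketch (Python/NumPy, our notation, vectorised over j), together with the semi-automatic scale choice of George et al. (2008) noted above; it is illustrative only.

```python
import numpy as np

def draw_delta(pi, kappa0, kappa1, q, rng):
    """Draw SSVS indicators delta_j | pi_j under the spike-and-slab mixture (23)-(24)."""
    dens0 = np.exp(-0.5 * (pi / kappa0) ** 2) / kappa0   # N(0, kappa0^2) density (up to a constant)
    dens1 = np.exp(-0.5 * (pi / kappa1) ** 2) / kappa1   # N(0, kappa1^2) density (up to a constant)
    prob1 = q * dens1 / (q * dens1 + (1.0 - q) * dens0)  # P(delta_j = 1 | pi_j)
    return (rng.uniform(size=np.shape(pi)) < prob1).astype(int)

# Semi-automatic scales: kappa0_j^2 = v_hat(pi_j)/10 and kappa1_j^2 = 10 * v_hat(pi_j),
# where v_hat(pi_j) is a preliminary estimate of the variance of pi_j.
```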

4 Monte Carlo results

In all cases below \( \gamma_0 = \gamma_1 = \gamma_2 = .1 \). The number of Monte Carlo simulations is set to 10,000. All \( \varepsilon_{tj} \sim \text{iid } N(0,1) \). In case IV, we set \( \sigma_\varepsilon = .3 \).

4.1 Model I: Incorrect Functional Form

The true model is \( y_t = \gamma_0 + \gamma_1 x_t + \tfrac{1}{2}\gamma_2 x_t^2 ,\; t = 1, \ldots, T \), and we have omitted the nonlinear term. The driver is \( z_t = \alpha x_t + \varepsilon_t \), with \( \varepsilon_t \sim \text{iid } N(0,1) \) and \( x_t \sim \text{iid } N(1,1) \). In this case the correlation between \( z_t \) and \( x_t^2 \) is \( \rho = \frac{3\alpha}{\sqrt{3\left( 3\alpha^2 + 1 \right)}} \).

If the correlation were equal to 1, this would be a perfect driver, as it exactly recreates the missing quadratic term; the estimation procedure would then be unbiased and efficient. If the correlation were zero, \( z_t \) would contain no information about the missing nonlinearity. We are, therefore, interested in varying this correlation and seeing how low it can fall before the estimator ceases to be useful.

In this case the true effect is \( \gamma_1 + \gamma_2 x_t \), that is, the derivative of \( y \) with respect to \( x \). There are no omitted variables or misspecifications other than the nonlinearity, so the set \( S_2 \) is empty and the estimate of the derivative \( \gamma_1 + \gamma_2 x_t \) is given by \( \beta_{1t} - \varepsilon_t \).
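A minimal sketch of one replication of this design (Python/NumPy; the function name and defaults are ours) may help fix ideas: the estimator sees only the linear model and the driver \( z_t \), while the quadratic term is withheld.

```python
import numpy as np

def simulate_model_I(T, alpha, gamma=(0.1, 0.1, 0.1), rng=None):
    """One replication of Model I: quadratic truth, linear TVC model, driver z = alpha*x + eps."""
    rng = rng or np.random.default_rng()
    g0, g1, g2 = gamma
    x = rng.normal(1.0, 1.0, T)                  # x_t ~ iid N(1, 1)
    y = g0 + g1 * x + 0.5 * g2 * x ** 2          # true (nonlinear) model
    z = alpha * x + rng.normal(0.0, 1.0, T)      # coefficient driver
    true_effect = g1 + g2 * x                    # dy/dx, the target of estimation
    return y, x, z, true_effect
```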

Table 1 gives the results of this set of Monte Carlo experiments for sample sizes of 50, 100, 200, 500 and 1000. When the correlation between z and x is very high, then even for small samples the bias is very small and the standard deviation of the results is also small, at around 1%. As the sample size grows, both the bias and the standard deviation fall, and the estimator is clearly consistent and efficient. Looking across the table, as the correlation between the driver and the true x variable falls, the estimation procedure still does very well until the correlation reaches about .5; at that point the bias and the standard error begin to rise quite substantially. This happens even more clearly with the very large sample size of T = 1000, where both the bias and standard error are very small until the correlation falls below .5.
Table 1

Monte Carlo results for Model I

Corr ρ      .95   .90   .80   .70   .60   .50   .40   .30   .20   .10   .00
T = 50
 Bias      .017  .017  .025  .028  .035  .048  .071  .098  .117  .125  .205
 SD        .011  .011  .012  .019  .041  .057  .091  .120  .144  .189  .265
T = 100
 Bias      .007  .007  .009  .017  .022  .035  .077  .114  .135  .181  .235
 SD        .008  .008  .007  .014  .035  .052  .128  .140  .192  .272  .301
T = 200
 Bias      .004  .004  .007  .012  .011  .070  .101  .177  .186  .244  .293
 SD        .006  .006  .005  .007  .012  .044  .177  .186  .281  .316  .387
T = 500
 Bias      .003  .003  .005  .008  .011  .079  .136  .218  .222  .271  .332
 SD        .004  .004  .006  .007  .032  .055  .190  .277  .334  .389  .415
T = 1000
 Bias      .001  .001  .003  .005  .007  .065  .225  .280  .345  .381  .414
 SD        .003  .003  .005  .009  .041  .062  .217  .305  .376  .414  .520

Corr ρ is the degree of correlation between the true driver and the driver used. T is the sample size; Bias is the percent absolute bias; SD is the standard deviation of the bias

4.2 Model II: Omitted Variables

The second model focuses on omitted variables. The true model is \( y_t = \gamma_0 + \gamma_1 x_{t1} + \gamma_2 x_{t2} \). The variables \( x_{t1}, x_{t2} \) are correlated: \( x_{t2} = \gamma x_{t1} + \xi_t \), with \( \xi_t, x_{t1} \sim \text{iid } N(0,1) \). The squared correlation between the two variables is \( \rho_{12}^2 = \frac{\gamma^2}{\gamma^2 + 1} \). We set \( \gamma = 2 \) so that this is .80.

We estimate the TVC model \( y_t = \beta_{0t} + \beta_{1t} x_{t1} \), again using a driver \( z_t = \alpha x_{t2} + \varepsilon_t \), and we examine how well the estimator performs as the correlation between \( z_t \) and \( x_{t2} \) falls. This correlation is \( \rho = \frac{\alpha}{\sqrt{\alpha^2 + 1}} \).

In this case, the true effect is \( \gamma_1 \) and the bias-free estimate is \( \beta_{1t} - \pi_1 z_t - e_t \).
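As before, a minimal sketch of one replication of this design (our construction; \( \alpha \) is chosen by the experimenter to give the desired correlation between \( z_t \) and \( x_{t2} \)):

```python
import numpy as np

def simulate_model_II(T, alpha, gamma=2.0, g=(0.1, 0.1, 0.1), rng=None):
    """One replication of Model II: x_t2 is omitted; the driver is z_t = alpha*x_t2 + eps_t."""
    rng = rng or np.random.default_rng()
    g0, g1, g2 = g
    x1 = rng.normal(0.0, 1.0, T)
    x2 = gamma * x1 + rng.normal(0.0, 1.0, T)    # omitted variable, correlated with x1
    y = g0 + g1 * x1 + g2 * x2                   # true model
    z = alpha * x2 + rng.normal(0.0, 1.0, T)     # driver proxying the omitted variable
    return y, x1, z                              # x2 is withheld from the estimator
```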

The results of this experiment are given in Table 2. They show a similar picture to Model I above. Both the bias and the standard deviation clearly decrease as the sample size increases. Even for the smallest sample size, both the bias and the standard deviation are quite small as long as the correlation between the driver and the misspecification is above .5. Again, once the correlation falls below .5, the bias and standard deviation rise quite quickly.
Table 2

Monte Carlo results for Model II

Corr ρ      .95   .90   .80   .70   .60   .50   .40   .30   .20   .10   .00
T = 50
 Bias      .021  .021  .029  .032  .038  .041  .058  .069  .082  .098  .125
 SD        .013  .013  .015  .019  .022  .033  .066  .083  .102  .128  .155
T = 100
 Bias      .014  .014  .021  .028  .032  .055  .067  .091  .122  .144  .171
 SD        .009  .009  .012  .017  .019  .028  .077  .107  .135  .176  .193
T = 200
 Bias      .009  .009  .017  .022  .027  .051  .083  .129  .142  .185  .215
 SD        .007  .008  .010  .015  .017  .020  .154  .196  .226  .287  .303
T = 500
 Bias      .007  .008  .011  .018  .022  .049  .124  .155  .189  .212  .288
 SD        .006  .007  .008  .013  .016  .022  .187  .234  .288  .317  .355
T = 1000
 Bias      .005  .006  .008  .010  .017  .047  .171  .222  .287  .334  .345
 SD        .004  .005  .007  .011  .014  .020  .199  .276  .302  .344  .381

Corr ρ is the degree of correlation between the true driver and the driver used. T is the sample size; Bias is the percent absolute bias; SD is the standard deviation of the bias

4.3 Model III: Measurement Error

The third model deals with measurement error. We generate data from \( y_t = \gamma_0 + \gamma_1 x_t \), then create \( y_t^* = y_t + \varepsilon_{t1} \) and \( x_t^* = x_t + \varepsilon_{t2} \). We then estimate the TVC model \( y_t^* = \beta_{0t} + \beta_{1t} x_t^* \), using two drivers, \( z_{t1} = \alpha_1 \varepsilon_{t1} + \varepsilon_{t3} \) and \( z_{t2} = \alpha_2 \varepsilon_{t2} + \varepsilon_{t4} \), and again see how things change as the \( \alpha \)'s get bigger.
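A minimal sketch of one replication of the measurement-error design (our notation; illustrative only):

```python
import numpy as np

def simulate_model_III(T, a1, a2, g=(0.1, 0.1), rng=None):
    """One replication of Model III: y and x observed with error; two drivers tied to the errors."""
    rng = rng or np.random.default_rng()
    g0, g1 = g
    x = rng.normal(0.0, 1.0, T)
    y = g0 + g1 * x                              # error-free relationship
    e1, e2, e3, e4 = rng.normal(size=(4, T))     # measurement and driver noise
    y_obs = y + e1                               # y observed with error
    x_obs = x + e2                               # x observed with error
    z1 = a1 * e1 + e3                            # driver correlated with the y-error
    z2 = a2 * e2 + e4                            # driver correlated with the x-error
    return y_obs, x_obs, z1, z2
```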

The results of this experiment are reported in Table 3. They are entirely consistent with the results in the earlier two cases. The technique is clearly consistent: as the sample size rises, the bias falls considerably. Even for a small sample the bias is quite low when the correlation between the driver and the measurement error is .5 or above.
Table 3

Monte Carlo results for Model III, \( \rho_{\varepsilon 1, z1} = .50 \)

Corr ρ      .95   .90   .80   .70   .60   .50   .40   .30   .20   .10   .00
T = 50
 Bias      .022  .024  .031  .036  .041  .055  .062  .077  .085  .097  .105
 SD        .011  .011  .015  .019  .022  .037  .055  .071  .080  .092  .101
T = 100
 Bias      .018  .019  .022  .029  .035  .050  .071  .082  .091  .108  .117
 SD        .009  .009  .015  .021  .030  .047  .077  .093  .105  .120  .146
T = 200
 Bias      .009  .009  .015  .021  .031  .047  .087  .095  .119  .126  .141
 SD        .007  .008  .012  .019  .027  .045  .090  .101  .138  .155  .188
T = 500
 Bias      .007  .008  .011  .016  .027  .040  .090  .122  .139  .155  .196
 SD        .004  .005  .009  .017  .020  .039  .115  .137  .152  .188  .217
T = 1000
 Bias      .005  .005  .007  .009  .016  .032  .117  .144  .162  .196  .213
 SD        .003  .003  .006  .010  .016  .030  .141  .166  .195  .225  .255

Corr ρ is the degree of correlation between the true driver and the driver used. T is the sample size; Bias is the percent absolute bias; SD is the standard deviation of the bias

4.4 Detecting Irrelevant Drivers

Next, we examine whether the SSVS5 procedure, which we have not applied so far, can correctly identify the drivers \( z_{t1}, z_{t2} \). To this end, we construct ten other drivers, say \( z_{t3}, \ldots, z_{t,12} \), from a multivariate normal distribution with zero means and equal correlations of .70. In Table 4 we report the equivalent of Table 3 plus the proportion of cases, say \( \Pi^* \), in which SSVS has correctly excluded \( z_{t3}, \ldots, z_{t,12} \) from the set of possible drivers.6
Table 4

Monte Carlo results for Model III, \( \rho_{\varepsilon 1, z1} = .50 \), SSVS

Corr ρ      .95    .90    .80    .70    .60    .50    .40    .30    .20    .10    .00
T = 50
 Bias      .025   .026   .033   .038   .044   .059   .067   .079   .091   .099   .109
 SD        .012   .012   .016   .020   .023   .039   .057   .075   .086   .095   .114
 Π*       60.5%  60.0%  58.2%  57.3%  51.3%  33.3%  12.2%   8.3%   4.5%    .0%    .0%
T = 100
 Bias      .019   .020   .024   .031   .038   .053   .075   .087   .096   .112   .120
 SD        .009   .011   .017   .023   .034   .049   .079   .098   .114   .126   .151
 Π*       71.2%  71.0%  62.3%  64.8%  58.2%  55.4%   9.3%   7.5%   3.3%    .0%    .0%
T = 200
 Bias      .012   .012   .019   .027   .035   .049   .089   .099   .121   .127   .148
 SD        .008   .008   .015   .021   .029   .047   .082   .103   .140   .159   .192
 Π*       87.3%  87.0%  77.3%  71.2%  61.5%  59.2%   8.2%   3.7%    .0%    .0%    .0%
T = 500
 Bias      .009   .009   .014   .019   .029   .043   .094   .128   .140   .162   .200
 SD        .005   .006   .012   .019   .023   .040   .119   .141   .158   .193   .232
 Π*       97.3%  96.5%  93.4%  85.5%  79.3%  62.7%   4.4%   1.0%    .0%    .0%    .0%
T = 1000
 Bias      .006   .006   .009   .015   .019   .035   .121   .147   .168   .201   .217
 SD        .004   .004   .007   .015   .018   .034   .144   .169   .198   .230   .266
 Π*       99.5%  98.3%  97.7%  91.2%  85.2%  77.7%   2.1%    .0%    .0%    .0%    .0%

Corr ρ is the degree of correlation between the true driver and the driver used. T is the sample size; Bias is the percent absolute bias; SD is the standard deviation of the bias. \( \Pi^* \) is the proportion of times the correct driver set is selected

There is again a remarkable cut-off at the correlation level of .5. Above this level the true driver set is correctly identified in around 60% of cases even for small samples, and in over 90% of cases for the largest sample. Once the correlation falls below .5, however, the proportion of correct identifications falls dramatically. An obvious conclusion is that when we have drivers effective enough to deliver reasonably good parameter estimates, the SSVS algorithm is very effective at detecting them.

4.5 Model IV: A More Complex Nonlinearity

The true model is \( y_t = \gamma_0 + \gamma_1 x_t + \exp\left( -\delta x_t^2 \right) + \varepsilon_t ,\; t = 1, \ldots, T \), and we have omitted the nonlinear term. The drivers form a Fourier basis \( \left\{ \cos(j x_t), \sin(j x_t), j = 1, \ldots, J \right\} \) after transforming all series to lie in (−π, π). We have \( \varepsilon_t \sim \text{iid } N(0,1) \) and \( x_t \sim \text{iid } N(0,1) \), ordered from smallest to largest. The drivers, that is, the Fourier terms in \( x_t \), are selected through the SSVS procedure. We set the maximum value of J to 10.

We again estimate the TVC model \( y_t = \beta_{0t} + \beta_{1t} x_t \), and this time the derivative of y with respect to x is \( \gamma_1 - 2\delta x_t \exp\left( -\delta x_t^2 \right) \). Our estimate of this is again given by \( \beta_{1t} - e_t \).
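The Fourier driver set can be built mechanically; a minimal sketch follows (our code; the rescaling of \( x_t \) into (−π, π) is one possible choice and an assumption on our part).

```python
import numpy as np

def fourier_drivers(x, J=10):
    """Candidate drivers {cos(j*u), sin(j*u), j = 1..J} for SSVS, with x mapped into (-pi, pi)."""
    u = (x - x.min()) / (x.max() - x.min())      # rescale to [0, 1]
    u = -np.pi + 2.0 * np.pi * u                 # then to (-pi, pi)
    cols = [np.cos(j * u) for j in range(1, J + 1)] + \
           [np.sin(j * u) for j in range(1, J + 1)]
    return np.column_stack(cols)                 # T x 2J matrix of candidate drivers
```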

The results for this exercise are given in Table 5. In this case for δ in the range .1–5 the bias remains very small, as does the standard deviation. There is also a noticeable reduction in both bias and standard deviation as the sample size increases.
Table 5

Monte Carlo results for Model IV, nonlinearity, SSVS/Fourier basis

δ           .1    .3    .5   1.00   5.00
T = 50
 Bias     .022  .025  .028   .031   .035
 SD       .014  .015  .016   .019   .023
T = 100
 Bias     .017  .022  .025   .029   .032
 SD       .011  .012  .014   .017   .020
T = 200
 Bias     .013  .015  .017   .019   .021
 SD       .008  .009  .011   .012   .017
T = 500
 Bias     .009  .010  .012   .014   .018
 SD       .005  .007  .009   .010   .014
T = 1000
 Bias     .005  .006  .007   .011   .013
 SD       .003  .004  .006   .009   .011

δ is the degree of missing nonlinearity given as in Sect. 4.5 above. T is the sample size. Bias is the percent absolute bias; SD is the standard deviation of the bias

4.6 Model V: An Endogeneity Experiment

In this experiment we have \( y_t = \gamma_1 + \gamma_2 x_{t1} + \gamma_3 x_{t2} + u_t \). The correlation between \( u_t \) and \( x_{tj} \) is .80 (j = 1, 2), so that endogeneity is quite strong in this model. Our drivers are four variables \( z_{t5}, \ldots, z_{t8} \) that are orthogonal to the error \( u_t \) and four drivers \( z_{t1}, \ldots, z_{t4} \) that are correlated with the error \( u_t \) but are orthogonal to each other as well as to the other four drivers. The degree of correlation between the drivers and \( u_t \) is \( \rho \). We are interested in \( \Pi^* \), the proportion of cases in which all of the drivers \( z_{t1}, \ldots, z_{t4} \) are included in the model and all of the drivers \( z_{t5}, \ldots, z_{t8} \) are excluded. Of course, we do not force the correct drivers into the final estimation.
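One simple way to build such a design is sketched below (our construction, not necessarily the authors'; the relevant drivers generated this way are mutually orthogonal only approximately in finite samples because they share the common component \( u_t \)).

```python
import numpy as np

def simulate_model_V(T, rho, g=(0.1, 0.1, 0.1), rng=None):
    """One replication of Model V: endogenous regressors and drivers correlated with the error."""
    rng = rng or np.random.default_rng()
    g1, g2, g3 = g
    u = rng.normal(0.0, 1.0, T)                                      # structural error
    x1 = 0.8 * u + np.sqrt(1 - 0.8 ** 2) * rng.normal(0.0, 1.0, T)   # corr(x1, u) = .8
    x2 = 0.8 * u + np.sqrt(1 - 0.8 ** 2) * rng.normal(0.0, 1.0, T)   # corr(x2, u) = .8
    y = g1 + g2 * x1 + g3 * x2 + u
    z_rel = rho * u[:, None] + np.sqrt(1 - rho ** 2) * rng.normal(size=(T, 4))  # relevant drivers
    z_irr = rng.normal(size=(T, 4))                                  # irrelevant drivers, orthogonal to u
    return y, np.column_stack([x1, x2]), np.column_stack([z_rel, z_irr])
```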

The results for this experiment are given in Table 6. Here the results differ somewhat from the earlier tables. The bias remains quite high, even for quite high correlations, for sample sizes of 50 or 100; it is only for much larger sample sizes that the bias becomes negligible. For larger sample sizes the bias is again small for correlations above .5, and the SSVS selection procedure works reasonably well.
Table 6

Monte Carlo results for Model V, endogeneity, SSVS

Corr ρ       .1    .3    .5    .7    .9
T = 50
 Bias      .491  .401  .387  .266  .212
 SD        .716  .710  .688  .644  .601
 Π*        .044  .081  .225  .447  .617
T = 100
 Bias      .386  .316  .303  .201  .138
 SD        .355  .350  .281  .277  .252
 Π*        .051  .128  .315  .517  .645
T = 200
 Bias      .300  .216  .287  .181  .101
 SD        .314  .310  .277  .201  .196
 Π*        .081  .201  .403  .615  .717
T = 500
 Bias      .295  .290  .101  .087  .071
 SD        .201  .200  .096  .061  .055
 Π*        .091  .261  .462  .687  .775
T = 1000
 Bias      .282  .280  .047  .031  .028
 SD        .195  .193  .032  .027  .016
 Π*        .101  .316  .518  .784  .801
T = 10,000
 Bias      .190  .190  .039  .001  .001
 SD        .182  .182  .030  .011  .008
 Π*        .115  .320  .615  .813  .917

Corr ρ is the degree of correlation between the true driver and the driver used. \( \Pi^* \) is the proportion of cases where all the drivers \( z_{t1}, \ldots, z_{t4} \) are included in the model and the drivers \( z_{t5}, \ldots, z_{t8} \) are all excluded. T is the sample size. Bias is the percent absolute bias; SD is the standard deviation of the bias

5 Conclusions

This paper has investigated the performance of the TVC estimation procedure in a Monte Carlo setting. The key element of TVC estimation is the identification and selection of a set of driver variables. With an ideal driver set, it is straightforward to show that the procedure is both consistent and efficient. However, in practice it is not possible to know if we have a perfect driver set. Therefore, we need to know how the procedure performs when the driver set is less than perfect. In this paper, we dealt with this issue in a Monte Carlo setting.

We construct a number of Monte Carlo experiments to examine the performance of the methodology under (i) clearly-defined conditions and (ii) a range of model misspecifications. We also propose a new Bayesian search technique for the set of driver variables underlying the TVC methodology. Experiments are performed to allow for incorrectly specified functional form, omitted variables, measurement errors, unknown nonlinearity and endogeneity. Our broad conclusion is that, even for relatively small samples, the technique works well so long as the correlation between the driver set and the misspecification in the model is greater than about .5. Both the bias and the efficiency of the estimators also improve as the sample size grows, but again a correlation of over .5 seems to be required. The only caveat to this result concerns strong simultaneity bias; in that case the sample size needs to be quite large (over 500) before the technique works reasonably well. Finally, we find that the SSVS technique also seems to perform well in finding an appropriate driver set from a much larger set of possible drivers.

Footnotes

  1. Swamy et al. (2003, 2010, 2012, 2015).

  2. Empirical applications include Hall et al. (2009, 2010, 2017), Tavlas et al. (2013), Hondroyiannis et al. (2013) and Kenjegaliev et al. (2013).

  3. Time variation of the coefficients is meaningless in a cross-section setting; there, the coefficients vary across the individual units in the cross-section. We simply re-interpret the t subscripts as i subscripts.

  4. See also Jochmann et al. (2010).

  5. An alternative to the SSVS procedure would be the LASSO prior. The procedures are similar in terms of timing and purpose. There is some evidence that both perform well (Pavlou et al. 2016) and in a similar manner, but further work is needed in this area.

  6. There is an issue here as to whether we need to start from a superset of drivers which includes the true ones. Clearly, if we do, this is an ideal situation and the Monte Carlo tells us how well the procedure performs. From a theoretical point of view, however, what we need is that the superset includes variables that are highly correlated with the true drivers. In a data-rich environment this would not be a strong restriction. Bai and Ng (2010) prove that common factors which drive all the variables in a system are valid instrumental variables. By the same reasoning, we could construct a set of factors from a large set of variables which would work well as drivers.

  7. This guarantees the existence of moments up to order four.

References

  1. Bai, J., & Ng, S. (2010). Instrumental variable estimation in a data rich environment. Econometric Theory, 26, 1577–1606.
  2. George, E., Sun, D., & Ni, S. (2008). Bayesian stochastic search for VAR model restrictions. Journal of Econometrics, 142, 553–580.
  3. Girolami, M., & Calderhead, B. (2011). Riemann manifold Langevin and Hamiltonian Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(2), 123–214.
  4. Granger, C. W. J. (2008). Nonlinear models: Where do we go next—Time-varying parameter models? Studies in Nonlinear Dynamics and Econometrics, 12(3), 1–9.
  5. Hall, S. G., Hondroyiannis, G., Swamy, P. A. V. B., & Tavlas, G. S. (2009). Where has all the money gone? Wealth and the demand for money in South Africa. Journal of African Economies, 18(1), 84–112.
  6. Hall, S. G., Hondroyiannis, G., Swamy, P. A. V. B., & Tavlas, G. S. (2010). The Fisher effect puzzle: A case of non-linear relationship. Open Economies Review. https://doi.org/10.1007/s11079-009-9157-1.
  7. Hall, S. G., Swamy, P. A. V. B., & Tavlas, G. S. (2017). Time-varying coefficient models: A proposal for selecting the coefficient driver sets. Macroeconomic Dynamics, 21(5), 1158–1174.
  8. Hondroyiannis, G., Kenjegaliev, A., Hall, S. G., Swamy, P. A. V. B., & Tavlas, G. S. (2013). Is the relationship between prices and exchange rates homogeneous. Journal of International Money and Finance, 37, 411–436. https://doi.org/10.1016/j.jimonfin.2013.06.014.
  9. Jochmann, M., Koop, G., & Strachan, R. W. (2010). Bayesian forecasting using stochastic search variable selection in a VAR subject to breaks. International Journal of Forecasting, 26(2), 326–347.
  10. Kenjegaliev, A., Hall, S. G., Tavlas, G. S., & Swamy, P. A. V. B. (2013). The forward rate premium puzzle: A case of misspecification? Studies in Nonlinear Dynamics and Econometrics, 3, 265–280. https://doi.org/10.1515/snde-2013-0009.
  11. Pavlou, M., Ambler, G., Seaman, S., De Lorio, M., & Omar, Z. (2016). Review and evaluation of penalised regression methods for risk prediction in low-dimensional data with few events. Statistics in Medicine, 37(7), 1159–1177.
  12. Swamy, P. A. V. B., Chang, I.-L., Mehta, J. S., & Tavlas, G. S. (2003). Correcting for omitted-variable and measurement-error bias in autoregressive model estimation with panel data. Computational Economics, 22, 225–253.
  13. Swamy, P. A. V. B., Hall, S. G., & Tavlas, G. S. (2012). Generalized cointegration: A new concept with an application to health expenditure and health outcomes. Empirical Economics, 42, 603–618. https://doi.org/10.1007/s00181-011-0483-y.
  14. Swamy, P. A. V. B., Hall, S. G., & Tavlas, G. S. (2015). A note on generalizing the concept of cointegration. Macroeconomic Dynamics, 19(7), 1633–1646. https://doi.org/10.1017/S1365100513000928.
  15. Swamy, P. A. V. B., Hall, S. G., Tavlas, G. S., & Hondroyiannis, G. (2010). Estimation of parameters in the presence of model misspecification and measurement error. Studies in Nonlinear Dynamics & Econometrics, 14, 1–35.
  16. Swamy, P. A. V. B., & Mehta, J. S. (1975). Bayesian and non-Bayesian analysis of switching regressions and a random coefficient regression model. Journal of the American Statistical Association, 70, 593–602.
  17. Tavlas, G. S., Swamy, P. A. V. B., Hall, S. G., & Kenjegaliev, A. (2013). Measuring currency pressures: The cases of the Japanese Yen, the Chinese Yuan, and the U.K. pound. Journal of the Japanese and International Economies, 29, 1–20. https://doi.org/10.1016/j.jjie.2013.04.001.

Copyright information

© The Author(s) 2018

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. School of Business, University of Leicester, Leicester, UK
  2. Bank of Greece, Athens, Greece
  3. Lancaster University, Lancaster, UK
  4. Athens University of Economics and Business, Athens, Greece
