A new approach for estimating VAR systems in the mixed-frequency case

In this paper we present a new estimation procedure, named MF-IVL, for VAR systems in the case of mixed-frequency data, where the data may be, e.g., stock or flow data. The main idea of this new procedure is to project the slow components on the present and past fast ones in order to create instrumental variables. This procedure is shown to be generically consistent. We claim that the procedure is fast and more accurate than the extended Yule-Walker procedure. The two procedures are compared by simulation.


Introduction
We propose a simple and fast algorithm for estimating the parameters in a multivariate high-frequency VAR system from mixed-frequency data. The VAR system is of the form

$$y_t = \begin{pmatrix} y^f_t \\ y^s_t \end{pmatrix} = A_1 y_{t-1} + \cdots + A_p y_{t-p} + \nu_t, \quad t \in \mathbb{Z}, \tag{1.1}$$

where $A_i \in \mathbb{R}^{n \times n}$ and the AR order $p$ is given. Throughout we assume the stability condition

$$\det(a(z)) \neq 0, \quad |z| \leq 1, \tag{1.2}$$

where $a(z) = I_n - A_1 z - \cdots - A_p z^p$. Here $z$ is used for the complex variable as well as for the backward shift on the integers $\mathbb{Z}$. We assume that $(\nu_t)$ is white noise and we only consider the stable steady state solution $y_t = a(z)^{-1} \nu_t$. The innovation covariance matrix $\Sigma_\nu = E(\nu_t \nu_t^T)$ is assumed to be non-singular. The parameter space for the high-frequency models considered thus consists of the system parameters $(A_1, \dots, A_p)$ satisfying (1.2) and the symmetric, positive definite matrices $\Sigma_\nu$.

In this paper we consider the problem of estimating the parameters of the $n$-dimensional high-frequency VAR model (1.1) using mixed-frequency data. We actually observe mixed-frequency data of the form

$$\begin{pmatrix} y^f_t \\ w_t \end{pmatrix}, \quad w_t = \sum_{i=1}^{N} c_i y^s_{t-i+1}, \tag{1.8}$$

where $c_i \in \mathbb{R}$, $1 < N \in \mathbb{N}$ and at least one $c_i \neq 0$. Here the $n_f$-dimensional, say, fast component $y^f_t$ is observed at the highest (sampling) frequency $t \in \mathbb{Z}$ and the $n_s$-dimensional slow component $w_t$ is observed only for $t \in N\mathbb{Z}$, i.e. for every $N$-th time point. In this paper we assume that $n_f \geq 1$. The population second moments which can be directly observed are of the form

$$\gamma^{ff}(h) = E\big(y^f_t (y^f_{t-h})^T\big),\ h \in \mathbb{Z}, \qquad \gamma^{wf}(h) = E\big(w_t (y^f_{t-h})^T\big),\ h \in \mathbb{Z}, \qquad \gamma^{ww}(h) = E\big(w_t w_{t-h}^T\big),\ h \in N\mathbb{Z}.$$

Generic identifiability of the high-frequency parameters $A_i$, $i = 1, \dots, p$, and $\Sigma_\nu$ has been shown in Anderson et al. (2016) (Theorems 2 and 3). Estimation procedures, in particular a procedure based on the extended Yule-Walker (XYW) equations [see Chen and Zadrozny (1998)], a procedure based on the Gaussian likelihood, and an EM algorithm, are discussed in Koelbl et al. (2016) and Koelbl (2015). There it is shown that the MLE as well as the EM estimator depend heavily on the initial estimator used. The purpose of this paper is to describe an estimation procedure which can be used as an initial estimator, e.g. for the EM algorithm, but also as an estimator in its own right, because it is easy to calculate, consistent and outperforms the estimator based on the XYW equations.
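The observation scheme can be illustrated with a short Python sketch. It assumes the aggregate has the form $w_t = \sum_{i=1}^{N} c_i y^s_{t-i+1}$, a scalar slow component, and 0-based list indices, so that the slow variable is available at the last index of every block of length $N$. The function name `observe_mixed_frequency` and the use of `None` for unobserved time points are illustrative choices, not part of the paper.

```python
def observe_mixed_frequency(y_slow, c):
    """Return the slow observations w_t implied by the weights c.

    y_slow : list of high-frequency values of the (scalar) slow component
    c      : list of N weights (c[0] plays the role of c_1, and so on)

    Entries at unobserved time points are None. With 0-based indexing the
    aggregate is observed at t = N-1, 2N-1, ... (an indexing assumption).
    """
    N = len(c)
    w = []
    for t in range(len(y_slow)):
        if t % N == N - 1:
            # w_t = c_1 * y_t + c_2 * y_{t-1} + ... + c_N * y_{t-N+1}
            w.append(sum(c[i] * y_slow[t - i] for i in range(N)))
        else:
            w.append(None)
    return w


# Stock variable: only c_1 is non-zero, so w_t equals the slow value itself.
stock = observe_mixed_frequency([1, 2, 3, 4, 5, 6], [1, 0])
# Flow variable with equal weights: w_t is the sum over the last N periods.
flow = observe_mixed_frequency([1, 2, 3, 4, 5, 6], [1, 1])
```

The fast component needs no such treatment, since it is observed at every time point.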

The stock case
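For the case of stock variables (i.e. $c_1 = 1$, $c_i = 0$, $i = 2, \dots, N$) the second moments which can be directly observed are

$$\gamma^{ff}(h) = E\big(y^f_t (y^f_{t-h})^T\big),\ h \in \mathbb{Z}, \qquad \gamma^{sf}(h) = E\big(y^s_t (y^f_{t-h})^T\big),\ h \in \mathbb{Z}, \qquad \gamma^{ss}(h) = E\big(y^s_t (y^s_{t-h})^T\big),\ h \in N\mathbb{Z}.$$

In Anderson et al. (2016) it is shown that the system parameters can be generically obtained from the extended Yule-Walker equations

$$E\Big(y_t \big((y^f_{t-1})^T, \dots, (y^f_{t-np})^T\big)\Big) = (A_1, \dots, A_p) Z_0, \qquad Z_0 = E\Big(\big(y_{t-1}^T, \dots, y_{t-p}^T\big)^T \big((y^f_{t-1})^T, \dots, (y^f_{t-np})^T\big)\Big).$$

Note that the second moments on the left as well as those on the right hand side of the above equation can be directly observed in the mixed-frequency stock case. In Anderson et al. (2016), Theorem 2, it is shown that $Z_0$ has generically full row rank and therefore we generically obtain

$$(A_1, \dots, A_p) = E\Big(y_t \big((y^f_{t-1})^T, \dots, (y^f_{t-np})^T\big)\Big) Z_0^T (Z_0 Z_0^T)^{-1}.$$

The XYW estimators are obtained by replacing the population second moments by their sample counterparts $\hat\gamma^{ff}(h)$ and $\hat\gamma^{sf}(h)$, where the estimator of $\gamma^{sf}(h)$ has only (approximately) $1/N$-th of the summands compared to the estimator of $\gamma^{ff}(h)$ due to the missing observations [see Koelbl et al. (2016)].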
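The effect of the missing slow observations on the sample second moments can be made concrete with a small Python sketch. It assumes scalar, zero-mean series, lags $h \geq 0$, missing values encoded as `None`, and normalization by the number of available summands; the exact normalization used in the cited references may differ, and the function name `sample_second_moment` is hypothetical.

```python
def sample_second_moment(x, z, h):
    """Average x[t] * z[t - h] over all t where both entries are observed.

    Missing observations are encoded as None and lags h >= 0 are assumed.
    Returns the pair (estimate, number_of_summands).
    """
    terms = [
        x[t] * z[t - h]
        for t in range(h, len(x))
        if x[t] is not None and z[t - h] is not None
    ]
    return sum(terms) / len(terms), len(terms)


# Fast series observed at every t, slow stock series only at every second t.
y_fast = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_slow = [None, 2.0, None, 4.0, None, 6.0]

gamma_ff, n_ff = sample_second_moment(y_fast, y_fast, 0)
gamma_sf, n_sf = sample_second_moment(y_slow, y_fast, 0)
```

In this toy example with $N = 2$, the cross moment is averaged over half as many summands as the moment of the fast series, which is the loss of information referred to above.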
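The new estimation procedure proposed is as follows. The basic idea is to generate instrumental variables by projecting the slow components $y^s_t$ on the space generated by the present and a sufficient number of lagged fast components $y^f_j$. To be more precise, let, for a suitably chosen $1 \leq k \leq t$, $H^f_k(t) = \operatorname{span}\{y^f_j : t-k \leq j \leq t\}$ be the Hilbert space spanned by the one-dimensional components of the $y^f_j$ in the underlying space of square integrable random variables $L_2$ over $(\Omega, \mathcal{A}, P)$, and let $x^k_{t|t-1}$ denote the (componentwise) projection of the state $x_t$ on $H^f_k(t-1)$. Then we obtain, using an obvious notation,

[...]

In a first step we show that the second moment matrix of the projected state $x^{k_0}_{t|t-1}$ is generically non-singular for a suitable $k_0$. We have $\Gamma^{ff}(k_0) > 0$, where $\Gamma^{ff}(k_0)$ denotes the covariance matrix of the stacked fast components spanning $H^f_{k_0}(t-1)$, which is a direct consequence of $\Sigma_\nu > 0$. The projection $x^{k_0}_{t|t-1}$ is obtained by using the OLS formula

[...]

and its second moment matrix is generically non-singular since $Z_0$ has generically full row rank. This implies that generically

[...]

holds, since $x^k_{t|t-1}$ is uncorrelated with $x^k_{t|t} - x^k_{t|t-1}$ and $\nu^k_{t|t}$. Note that, for $k \geq p-1$,

[...]

An estimator of the state $x^k_{t+1|t}$, denoted by $\hat{x}^k_{t+1|t}$, can be constructed as follows. W.l.o.g. let $p = 2$ and $N = 2$. The first $n$ components of $x^k_{t+1|t}$ can be estimated by projecting $y_t$ onto $y^f_t, \dots, y^f_{t-k}$. This can be done by estimating $\beta_1$ in

$$y_t = \beta_1 Y^-_{t,k} + u_{1,t}, \quad t \in 2\mathbb{Z}, \tag{2.8}$$

where $Y^-_{t,k} = \big((y^f_t)^T, \dots, (y^f_{t-k})^T\big)^T$ and $u_{1,t}$ denotes the projection error. Let $\hat\beta_{1,T}$ denote the OLS estimator of $\beta_1$. Then we obtain $(I_n, 0)\,\hat{x}^k_{t+1|t} = \hat\beta_{1,T} Y^-_{t,k}$, $t \in \mathbb{Z}$. The second $n$ components of $x^k_{t+1|t}$ must be, due to the mixed-frequency structure and $N = 2$, estimated in a different way. Analogously to (2.8) we can construct

$$y_{t-1} = \beta_2 Y^-_{t,k} + u_{2,t}, \quad t \in 2\mathbb{Z}, \tag{2.9}$$

but now we cannot directly observe the left hand side of (2.9). Therefore, we must shift (2.9) to

$$y_t = \beta_2 Y^-_{t+1,k} + u_{2,t+1}, \quad t \in 2\mathbb{Z}, \tag{2.10}$$

which directly leads us to the OLS estimator $\hat\beta_{2,T}$ of $\beta_2$, since the left as well as the right hand side of (2.10) can be directly observed. In a last step we can construct the remaining part of the state with the help of $(0, I_n)\,\hat{x}^k_{t+1|t} = \hat\beta_{2,T} Y^-_{t,k}$, $t \in \mathbb{Z}$, which leads us to

$$\hat{x}^k_{t+1|t} = \begin{pmatrix} \hat\beta_{1,T} \\ \hat\beta_{2,T} \end{pmatrix} Y^-_{t,k}, \quad t \in \mathbb{Z}. \tag{2.11}$$

Using these instrumental variables, we can estimate $A$ according to (2.7):

[...] (2.12)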
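To make the instrumental-variable idea tangible, the following Python sketch treats the simplest special case: $p = 1$, $N = 2$, stock data, and scalar fast and slow components. It is a simplified two-stage illustration under these assumptions and is not claimed to reproduce the estimator (2.12). The first stage regresses $y_t$ on $(y^f_t, \dots, y^f_{t-k})$ at the time points where the slow component is observed. The fitted values, lagged by one period, then serve as instruments $z_t$ for $y_{t-1}$, and the second stage uses $A_1 = E(y_t z_t^T)\,E(z_t z_t^T)^{-1}$, which holds because $\nu_t$ and the projection error of $y_{t-1}$ are both orthogonal to $z_t$. All function names are hypothetical, and missing slow observations are encoded as NaN.

```python
import numpy as np


def simulate_var1(A, T, rng):
    """Simulate y_t = A y_{t-1} + nu_t with standard normal innovations."""
    y = np.zeros((T, A.shape[0]))
    for t in range(1, T):
        y[t] = A @ y[t - 1] + rng.standard_normal(A.shape[0])
    return y


def stacked_fast(y_fast, k):
    """Row t holds (y^f_t, ..., y^f_{t-k}); rows with t < k are NaN."""
    T = len(y_fast)
    Y = np.full((T, k + 1), np.nan)
    for j in range(k + 1):
        Y[j:, j] = y_fast[:T - j]
    Y[:k] = np.nan
    return Y


def iv_estimate_var1(y_fast, y_slow_obs, k):
    """Two-stage IV estimate of A_1; y_slow_obs is NaN where unobserved."""
    T = len(y_fast)
    t_idx = np.arange(T)
    Y = stacked_fast(y_fast, k)
    y = np.column_stack([y_fast, y_slow_obs])
    observed = ~np.isnan(y_slow_obs)

    # Stage 1: OLS of y_t on (y^f_t, ..., y^f_{t-k}) over fully observed t.
    stage1 = observed & (t_idx >= k)
    beta, *_ = np.linalg.lstsq(Y[stage1], y[stage1], rcond=None)

    # Instrument z_t: fitted projection of y_{t-1}, computable for every t > k
    # because it only involves the fast component.
    z = np.full((T, 2), np.nan)
    z[k + 1:] = Y[k:-1] @ beta

    # Stage 2: A_1 = E(y_t z_t^T) E(z_t z_t^T)^{-1}, on fully observed t only.
    stage2 = observed & (t_idx >= k + 1)
    S_yz = y[stage2].T @ z[stage2]
    S_zz = z[stage2].T @ z[stage2]
    return S_yz @ np.linalg.inv(S_zz), z


# Illustrative run with an arbitrary stable A_1 (not a model from the paper).
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.4], [-0.6, 0.3]])
y_full = simulate_var1(A_true, 20000, rng)
slow_obs = y_full[:, 1].copy()
slow_obs[1::2] = np.nan          # slow component observed at even t only
A_hat, instruments = iv_estimate_var1(y_full[:, 0], slow_obs, k=4)
```

Since $y^f_t$ is itself among the first-stage regressors, the first component of the instrument coincides with $y^f_{t-1}$; only the slow component is genuinely replaced by a projection.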

Theorem 1 Under the additional assumptions that $\lim_{T \to \infty}$ [...], the estimator (2.12) is generically consistent.
Proof Again we assume that $p = 2$ and $N = 2$. The above condition $\lim_{T \to \infty}$ [...], where $\gamma(j) = E\big(y_t y_{t-j}^T\big)$. In a next step, let us write (2.6) as

[...] (2.13)

Equation (2.13) implies that $\hat\beta_{1,T}$ and $\hat\beta_{2,T}$ are consistent estimators for $\beta_1$ and $\beta_2$, respectively. Thus, we obtain that

[...]

This concludes the proof.
Of course, the choice of $k$ is important for estimating the system parameters. Our approach is to regress $y^s_t$ on $y^f_t, \dots, y^f_{t-k}$ and to determine the maximum lag $k$ by using AIC. Note that the structure of the matrix $A$, as far as the a priori zeros and ones are concerned, is not preserved by the estimation procedure (2.12). For this reason, we define a new estimator for the system parameters as

[...]

Clearly, $\hat{A}_T$ is also consistent. As shown in Anderson et al. (2016), the innovation covariance matrix $\Sigma_\nu$ can be generically consistently estimated according to the following formula

$$\operatorname{vec}(\Sigma_\nu) = \Big((G \otimes G)\big(I_{(np)^2} - A \otimes A\big)^{-1}(G^T \otimes G^T)\Big)^{-1} \operatorname{vec}(\gamma(0)),$$

where $G = (I_n, 0, \dots, 0)$ and where $\otimes$ denotes the Kronecker product. Let $\hat\Sigma_\nu$ denote the corresponding estimator.
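The lag selection step can be sketched in Python as follows. The sketch assumes a scalar slow and a scalar fast component, missing slow observations encoded as NaN, and the common Gaussian regression form of the criterion, $\mathrm{AIC}(k) = m \log \hat\sigma^2_k + 2(k+1)$ with $m$ the number of usable observations. The paper does not spell out which variant of AIC is used, so this form and the function name `select_lag_by_aic` are assumptions. All candidate lags are fitted on the same observations so that the criteria are comparable.

```python
import numpy as np


def select_lag_by_aic(y_fast, y_slow_obs, k_max):
    """Choose k in 0..k_max by AIC for the regression of y^s_t on
    (y^f_t, ..., y^f_{t-k}), using only t where y^s_t is observed."""
    T = len(y_fast)
    # Common estimation sample: observed slow values with k_max fast lags.
    rows = np.array([t for t in range(k_max, T) if not np.isnan(y_slow_obs[t])])
    target = y_slow_obs[rows]
    best_k, best_aic = None, np.inf
    for k in range(k_max + 1):
        X = np.column_stack([y_fast[rows - j] for j in range(k + 1)])
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ coef
        sigma2 = resid @ resid / len(rows)
        aic = len(rows) * np.log(sigma2) + 2 * (k + 1)
        if aic < best_aic:
            best_k, best_aic = k, aic
    return best_k


# Toy data in which the slow series depends on the fast one up to lag 2.
rng = np.random.default_rng(1)
fast = rng.standard_normal(400)
slow = np.full(400, np.nan)
slow[2:] = 0.8 * fast[2:] - 0.5 * fast[:-2] + 0.1 * rng.standard_normal(398)
slow[1::2] = np.nan              # slow series observed at even t only
k_selected = select_lag_by_aic(fast, slow, k_max=6)
```

AIC is known to favour somewhat larger models, so the selected lag can exceed the smallest adequate one.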
Note that the estimator $\hat{A}_T$ (denoted by MF-IVL estimator) does not necessarily give a stable AR system, nor is $\hat\Sigma_\nu$ necessarily positive definite. Projecting a symmetric matrix on the space of positive definite symmetric matrices is in a certain sense a standard procedure (see Higham 1989; Koelbl 2015). Projecting unstable system parameters on the space of stable ones is described in Koelbl (2015) and, for the univariate case, in Orbandexivry et al. (2013). Projecting slow variables on fast lagged variables is also mentioned in Ghysels et al. (2007).
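Both checks are easy to carry out numerically. The Python sketch below tests the stability condition (1.2) through the eigenvalues of the companion matrix, which is equivalent to requiring that $\det(a(z))$ has no zeros in the closed unit disc, and it repairs an indefinite covariance estimate by symmetrizing and clipping negative eigenvalues. The clipping step is the well-known Frobenius-norm projection onto the positive semidefinite matrices; the optional floor `eps` for strict positive definiteness and all function names are assumptions of this sketch. The projection of unstable system parameters onto stable ones is not covered here.

```python
import numpy as np


def companion(A_list):
    """Companion matrix of a VAR(p) with coefficient matrices A_1, ..., A_p."""
    n, p = A_list[0].shape[0], len(A_list)
    top = np.hstack(A_list)
    if p == 1:
        return top
    bottom = np.hstack([np.eye(n * (p - 1)), np.zeros((n * (p - 1), n))])
    return np.vstack([top, bottom])


def is_stable(A_list):
    """True if all eigenvalues of the companion matrix lie inside the unit circle."""
    return bool(np.max(np.abs(np.linalg.eigvals(companion(A_list)))) < 1.0)


def nearest_psd(S, eps=0.0):
    """Symmetrize S and clip its eigenvalues from below at eps."""
    S_sym = (S + S.T) / 2.0
    w, V = np.linalg.eigh(S_sym)
    return (V * np.maximum(w, eps)) @ V.T


stable = is_stable([np.array([[0.5, 0.4], [-0.6, 0.3]])])
repaired = nearest_psd(np.array([[1.0, 2.0], [2.0, 1.0]]))
```

Choosing `eps` slightly above zero yields a strictly positive definite matrix at the cost of a small additional perturbation.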

The flow case
For the case of the more general observation scheme (1.8), we proceed as follows. Let

[...] (2.18)

Let $f^k_{t+1|t}$ denote the projection of $f_{t+1}$ on the space $H^f_k(t)$. Projecting both sides of (2.18) on the space $H^f_k(t)$ we get, in an obvious notation,

[...]

and taking expectations we obtain

[...] (2.20)

Again, identifiability of the system parameters can be shown if we show that the matrix [...] is non-singular. This is proved as follows: for $k_0 = np - 1$ it follows that this matrix is generically non-singular since $Z^g_0$ has generically full row rank (see Koelbl 2015). Using (2.20), a consistent estimation procedure is obtained analogously to the stock case described above. The innovation covariance matrix $\Sigma_\nu$ can be estimated as in Koelbl et al. (2016).

Simulations
In this section we present a simulation study comparing the accuracy of the MF-IVL estimator with the accuracy of the XYW estimator and comparing these procedures as initial estimators for the EM algorithm. We consider data generating processes corresponding to the following two models.

Example 1 Model 1 (which was also presented in Koelbl et al. (2016)) is of the form

[...]

In both cases the innovations are standard normally distributed, i.e. $\nu_t \sim N(0, I_i)$, $i = 2, 3$.
The simulation study reports the mean squared errors for the parameters $\theta = \operatorname{vec}(A_1)$ and $\theta = \operatorname{vec}(A_1, A_2)$, respectively. The sample size is $T = 500$ and we performed $m = 10^3$ simulation runs. Only the case of stock variables has been considered. We put $N = 2$ and $n_s = 1$. The following estimation procedures are compared in this study. HF-YW denotes the Yule-Walker estimator obtained from high-frequency data. This estimator serves as an overall benchmark, and therefore the mean squared errors relative to those of the HF-YW estimator are also presented. MF-XYW denotes the mixed-frequency XYW estimator and MF-IVL the mixed-frequency estimator introduced in this paper. MF-EM-XYW denotes the mixed-frequency EM algorithm initialized with the MF-XYW estimator, and MF-EM-IVL the mixed-frequency EM algorithm initialized with the MF-IVL estimator. Table 1 summarizes the results.
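The accuracy measures can be written down in a few lines of Python. The sketch assumes that the mean squared error is the average, over the $m$ simulation runs, of the squared Euclidean distance between the estimated and the true parameter vector $\theta$; whether the study sums or averages over the components of $\theta$ is not stated, so this convention and the function names are assumptions.

```python
def mean_squared_error(estimates, theta):
    """Average over runs of the squared Euclidean distance to theta.

    estimates : list of parameter vectors (one per simulation run)
    theta     : true parameter vector
    """
    total = 0.0
    for est in estimates:
        total += sum((e - t) ** 2 for e, t in zip(est, theta))
    return total / len(estimates)


def relative_mse(estimates, benchmark_estimates, theta):
    """MSE of an estimator relative to the MSE of a benchmark estimator."""
    return mean_squared_error(estimates, theta) / mean_squared_error(
        benchmark_estimates, theta
    )


# Two made-up runs for a two-dimensional parameter, purely for illustration.
theta_true = [1.0, 2.0]
mf_runs = [[1.0, 2.5], [1.5, 2.0]]
hf_runs = [[1.0, 2.25], [1.25, 2.0]]
ratio = relative_mse(mf_runs, hf_runs, theta_true)
```

A relative value above one indicates that the mixed-frequency estimator is less accurate than the high-frequency benchmark, which is to be expected since the benchmark uses data that are not available in practice.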
Note that for both models MF-IVL outperforms MF-XYW as far as the overall mean squared errors are concerned. This also holds for the estimators of the individual system parameters as well as for the corresponding estimates of the noise parameters. When the two procedures are used as initial estimators for the EM algorithm, MF-IVL again outperforms MF-XYW. In addition, the number of iterations of the EM algorithm decreases for both models when it is initialized with the MF-IVL estimator instead of the MF-XYW estimator.

Conclusions
This paper proposes a new estimation procedure in the framework of VAR models and mixed-frequency data. The procedure is obtained by creating instrumental variables by projecting the slow variables on present and past fast ones. We show generic consistency of the estimator of the system parameters for stock and flow variables. Simulations are presented to compare the properties of our procedure with those of the XYW estimator.
Both procedures are less accurate than the MLE; our procedure, however, outperforms the XYW estimator.