1 Introduction

In recent years, due to the availability of data on a vast number of macroeconomic and financial variables, there has been increasing interest in modeling large systems of economic time series. To reduce the dimensionality and extract the underlying factors, one can use dynamic factor models (DFMs), originally introduced in economics by Geweke (1977) and Sargent and Sims (1977). The aim of DFMs is to represent the dynamics of the system through a small number of hidden common factors, which are mainly used for forecasting and macroeconomic policy-making; see Stock and Watson (2011) and Breitung and Choi (2013) for recent reviews of the existing literature. Kajal Lahiri has contributed to the DFM literature with several empirical works. For example, Lahiri and Yao (2004) implement a DFM to analyze the business cycle features of the transportation sector, and Lahiri and Sheng (2010) use one to measure forecast uncertainty by disagreement. Lahiri et al (2015) also apply a DFM to a real-time jagged-edge data set of over 160 explanatory variables to re-examine the role of consumer confidence surveys in forecasting personal consumption expenditure. The properties of many popular factor extraction procedures rely on the number of factors in the system being known. However, in practice, the number of factors is unknown and needs to be determined. Among the most popular procedures for this purpose are the criteria of Bai and Ng (2002), which are now standard in the literature. These criteria are based on modifications of the Akaike information criterion (AIC) and Bayesian information criterion (BIC), taking into account the cross-sectional and temporal dimensions of the dataset as arguments of the function penalizing overparametrization.
Alternatively, Onatski (2010) proposes an estimator of the number of factors based on using differences between adjacent eigenvalues of the sample covariance matrix of the variables contained in the system, arranged in descending order, while Ahn and Horenstein (2013) propose two alternative estimators based on ratios of adjacent eigenvalues.

It is well known that macroeconomic time series are frequently non-stationary and possibly cointegrated. Within the context of principal components (PC) factor extraction, and following Stock and Watson (2002), the most popular way of dealing with large systems of non-stationary macroeconomic variables is by differencing the variables in a univariate fashion; see, for example, Breitung and Eickmeier (2011); Stock and Watson (2012a, 2012b); Barhoumi et al (2013); Buch et al (2014); Moench et al (2013); Bräuning and Koopman (2014); Poncela et al (2014) and Jungbacker and Koopman (2015) for recent references. The theoretical justification of this extended practice is analyzed by Bai and Ng (2004), who show that applying PC to first-differenced data and recovering the original factors by “recumulating” is consistent regardless of whether the factors and/or idiosyncratic errors are I(0) or I(1)Footnote 1. However, their theory proceeds assuming that the number of common factors in the system is known. On the other hand, as mentioned above, macroeconomic variables are not only non-stationary but can also be cointegrated. Differencing a cointegrated system may distort the determination of the number of factors due to the introduction of non-invertible moving average (MA) components and/or the trade-off introduced between the variances of the common and idiosyncratic components. Surprisingly, there has been little discussion in the literature on whether differencing in a univariate fashion affects the correct determination of the number of factors. As far as we know, only Bai (2004) analyzes the performance of the information criteria proposed by Bai and Ng (2002) when implemented on differenced data. In his Monte Carlo experiments, carried out for a single DFM with contemporaneously uncorrelated idiosyncratic noises following an ARMA model and two random walk factors, he shows that the number of factors is correctly determined.

The main objective of this paper is to fill this gap by analyzing the effects of univariate stationary transformations of cointegrated systems when determining the number of factors using the approaches proposed by Bai and Ng (2002); Onatski (2010) and Ahn and Horenstein (2013). In the context of a DFM with mutually uncorrelated and homoscedastic idiosyncratic noises, we first derive analytically the eigenvalues of the covariance matrix and show how they are affected by univariate differencing. We also carry out Monte Carlo experiments considering several designs selected to represent different situations that can potentially be encountered in the empirical analysis of real macroeconomic variables. Finally, we illustrate the results by determining the number of factors in a system of prices of the euro area. It is important to note that the procedures for determining the number of factors considered in this paper are designed for what is known in the literature as static factors. Alternatively, several factor determination procedures have been proposed in the context of dynamic factors; see, for example, Amengual and Watson (2007); Hallin and Liska (2007); Bai and Ng (2007); Jacobs and Otter (2008) and Breitung and Pigorsch (2013). The difference between static and dynamic factors is described by, for example, Bai and Ng (2008). They argue that, although dynamic factors can be useful for establishing the number of primitive shocks in the economy, the properties of estimated static factors are better understood from a theoretical point of view. Furthermore, we focus the analysis on procedures to detect the number of static factors, as they are more popular in empirical economics.

The rest of this paper is structured as follows. In Sect. 2, we briefly describe the stationary DFM and the factor determination approaches considered. In Sect. 3, we analyze the effects of transforming non-stationary systems by univariate stationary transformations on these procedures. In Sect. 4, we report the results of the Monte Carlo experiments carried out to illustrate their finite sample performance. In Sect. 5, we carry out an empirical application. Finally, we conclude in Sect. 6.

2 The stationary dynamic factor model

In this section, we introduce notation and the stationary DFM and describe the factor determination procedures considered.

2.1 The model

We consider a DFM with cross-sectional dimension N, where the unobserved \(r<N\) common factors, \(F_{t}=(F_{1t},\dots ,F_{rt}){^{\prime }}\), and the idiosyncratic noises, \(\varepsilon _{t}=(\varepsilon _{1t},\dots ,\varepsilon _{Nt}){^{\prime }}\), follow VAR(1) processes. The factors explain the common evolution of a vector of time series, \(Y_{t}=(y_{1t},\dots ,y_{Nt}){^{\prime }}\) observed from \(t=1,\dots ,T\). The basic DFM considered is given by

$$\begin{aligned} Y_{t}= & {} PF_{t}+\varepsilon _{t}, \end{aligned}$$
(1)
$$\begin{aligned} F_{t}= & {} \varPhi F_{t-1}+\eta _{t}, \end{aligned}$$
(2)
$$\begin{aligned} \varepsilon _{t}= & {} \varGamma \varepsilon _{t-1}+a_{t}, \end{aligned}$$
(3)

where the factor disturbances, \(\eta _{t}=(\eta _{1t},\dots ,\eta _{rt}){^{\prime }}\), are \(r\times 1\) vectors, distributed independently from the idiosyncratic noises for all leads and lags. Furthermore, \(\eta _{t}\) and \(a_{t}\) are Gaussian white noises with positive definite covariance matrices \(\varSigma _{\eta }\) and \(\varSigma _{a}\), respectively, and \(P=({p^{\prime }}_{1},\dots ,{p^{\prime }}_{N}){^{\prime }}\) is the \(N\times r\) matrix of factor loadings, where \(p_{i}=(p_{i1},\dots ,p_{ir})\). Finally, \(\varPhi =\text {diag}(\phi _{1},\dots ,\phi _{r})\) and \(\varGamma \) are \(r\times r\) and \(N\times N\) matrices containing the autoregressive parameters of the factors and the idiosyncratic components, respectively. These autoregressive matrices satisfy the usual stationarity assumptions. Furthermore, we assume that the structure of the idiosyncratic noises is such that they are weakly correlated. Following Bai and Ng (2002); Onatski (2012, 2015) and Ahn and Horenstein (2013), we consider the entries in P, \(\varPhi \), \(\varSigma _{\eta }\), \(\varGamma \) and \(\varSigma _{a}\) as fixed parameters. Jungbacker and Koopman (2015) and Alvarez et al (2016) implement the DFM in Eqs. (1) to (3) on the data set of Stock and Watson (2005).
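As an illustration, the DFM in Eqs. (1) to (3) can be simulated in a few lines. This is a minimal sketch assuming diagonal autoregressive matrices with a common coefficient and identity disturbance covariances; the function name `simulate_dfm` and all default parameter values are ours, chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_dfm(N=50, T=200, r=2, phi=0.5, gamma=0.3, burn=100):
    """Simulate the DFM of Eqs. (1)-(3) with diagonal AR matrices and
    Gaussian white-noise disturbances with identity covariances."""
    P = rng.uniform(0, 1, size=(N, r))              # N x r loading matrix
    F = np.zeros((r, T + burn))                     # common factors
    eps = np.zeros((N, T + burn))                   # idiosyncratic noises
    for t in range(1, T + burn):
        F[:, t] = phi * F[:, t - 1] + rng.standard_normal(r)        # Eq. (2)
        eps[:, t] = gamma * eps[:, t - 1] + rng.standard_normal(N)  # Eq. (3)
    Y = P @ F[:, burn:] + eps[:, burn:]             # Eq. (1)
    return Y, P, F[:, burn:]

Y, P, F = simulate_dfm()
```

The burn-in period discards the influence of the zero initial conditions when the processes are stationary.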

The DFM in Eqs. (1) to (3) is not identified because, for any \(r\times r\) nonsingular matrix H, the system can be expressed in terms of a new loading matrix and a new set of common factors. A normalization is necessary to solve this identification problem and uniquely define the factors. In the context of PC factor extraction, it is common to impose the restrictions \(P{^{\prime }}P/N=I_{r}\) and \(FF{^{\prime }}\) being diagonal, where \(F=(F_{1},\dots ,F_{T})\) is an \(r\times T\) matrix of common factors; see Stock and Watson (2002); Bai and Ng (2002, 2008, 2013); Connor and Korajczyk (2010) and Bai and Wang (2014) for papers dealing with identification issues. Note that these are normalization restrictions, and they may not have an economic interpretation.

2.2 Determining the number of factors

The DFM described above assumes that the number of factors, r, is known. However, in practice, it needs to be estimated. Obtaining the correct value of r is crucial for an adequate estimation of the space spanned by the factors. There are several alternative procedures designed to determine r in DFMs. In this paper, we consider the information criteria proposed by Bai and Ng (2002) and the estimators proposed by Onatski (2010) and Ahn and Horenstein (2013).Footnote 2

2.2.1 The Bai and Ng (2002) information criteria

The most popular information criteria to select the number of factors in DFMs, proposed by Bai and Ng (2002), are based on a consistent PC estimator of P and \(F_{t}\) which is given by the solution to the following least squares problem

$$\begin{aligned} \min _{F_{1},\dots ,F_{T},P}V_{r}(P,F) \end{aligned}$$
(4)

subject to \(P{^{\prime }}P/N=I_{r}\ \text {and}\ \textit{FF}{^{\prime }}\ \text {being diagonal,}\) where

$$\begin{aligned} V_{r}(P,F)=\frac{1}{NT}\sum _{t=1}^{T}(Y_{t}-\textit{PF}_{t}){^{\prime }}(Y_{t}-\textit{PF}_{t})=\frac{1}{NT}\sum _{t=1}^{T}\sum _{i=1}^{N}\varepsilon _{it}^{2}=\frac{1}{NT}tr(\varepsilon \varepsilon {^{\prime }}),\quad \end{aligned}$$
(5)

where \(\varepsilon =(\varepsilon _{1},\dots ,\varepsilon _{T})\) has dimension \(N\times T\). The solution to (4) is obtained by setting \(\hat{P}\) equal to \(\sqrt{N}\) times the eigenvectors corresponding to the r largest eigenvalues of \(YY{^{\prime }}\) where \(Y=(Y_{1},\dots ,Y_{T})\). The corresponding PC estimator of F is given by \(\hat{F}=N^{-1}\hat{P}{^{\prime }}Y.\)
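The PC solution just described can be sketched directly from the eigendecomposition of \(YY{^{\prime }}\); the function name `pc_estimate` is ours, introduced for illustration:

```python
import numpy as np

def pc_estimate(Y, r):
    """PC estimator: P_hat is sqrt(N) times the eigenvectors of YY'
    associated with its r largest eigenvalues; F_hat = P_hat'Y / N."""
    N = Y.shape[0]
    eigval, eigvec = np.linalg.eigh(Y @ Y.T)     # eigh returns ascending order
    idx = np.argsort(eigval)[::-1][:r]           # indices of the r largest
    P_hat = np.sqrt(N) * eigvec[:, idx]
    F_hat = P_hat.T @ Y / N
    return P_hat, F_hat
```

Note that the normalization \(P{^{\prime }}P/N=I_{r}\) holds by construction, since the eigenvectors are orthonormal.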

PC factor extraction separates the common component, \(PF_{t}\), from the idiosyncratic noises by averaging the variables within \(Y_{t}\) cross-sectionally such that, when N and T tend simultaneously to infinity, the weighted averages of the idiosyncratic noises converge to zero, leaving only the linear combinations of the factors. Therefore, it requires that the cumulative effects of the common component increase proportionally with N, while the eigenvalues of \(\varSigma _{\varepsilon }=E(\varepsilon _{t}\varepsilon _{t}{^{\prime }})\) remain bounded; see the review of Breitung and Choi (2013) for a description of these conditionsFootnote 3. Bai (2003) proves that the PC estimators of factors, factor loadings and common components are asymptotically equivalent to the maximum likelihood estimators and, consequently, consistent. He also derives the rates of convergence and the corresponding limiting distributions when N and T tend simultaneously to infinity.

In order to determine r, Bai and Ng (2002) propose minimizing the following functions with respect to k, for \(k=0,\dots ,r_{\max },\)

$$\begin{aligned} \textit{IC}_{1}(k)&=\ln V_{k}(\hat{P},\hat{F})+k\frac{N+T}{NT}\ln \frac{NT}{N+T}, \end{aligned}$$
(6a)
$$\begin{aligned} \textit{IC}_{2}(k)&=\ln V_{k}(\hat{P},\hat{F})+k\frac{N+T}{NT}\ln m, \end{aligned}$$
(6b)
$$\begin{aligned} \textit{IC}_{3}(k)&=\ln V_{k}(\hat{P},\hat{F})+k\frac{\ln m}{m}, \end{aligned}$$
(6c)

where \(V_{k}(\hat{P},\hat{F})\) is defined as in expression (5) with P and \(F_{t}\) substituted by their respective PC estimates, \(m=\min \{N,T\}\) and \(r_{\max }\) is a bounded integer such that \(r\le r_{\max }\). The criteria in (6) are quite sensitive to the choice of \(r_{\max }\); see the Monte Carlo results in Ahn and Horenstein (2013). Bai and Ng (2002) use \(r_{\max }=8\) in their Monte Carlo experiments while, in the context of first-differenced data, Bai and Ng (2004) use \({\textit{IC}}_{1}(k)\) with \(r_{\max }=6\). Under appropriate assumptions, Bai and Ng (2002) prove the consistency of the information criteria above for determining the number of common factors.

If \(\hat{\varepsilon }_{t}=Y_{t}-{\hat{P}}{\hat{F}}_{t}\) are the residuals of the regression of the variables in Y on the first r principal components of \(\frac{1}{NT}YY{^{\prime }}\), then \(tr(\hat{\varepsilon }\hat{\varepsilon }{^{\prime }})=tr(YY{^{\prime }})-tr(\hat{P}\hat{F}\hat{F}{^{\prime }}\hat{P}{^{\prime }})=T\sum _{i=1}^{m}\hat{\lambda }_{i}-T\sum _{i=1}^{r}\hat{\lambda }_{i}=T\sum _{i=r+1}^{m}\hat{\lambda }_{i},\) where \(\hat{\lambda }_{i}\), \(i=1,\dots ,m\) are the eigenvalues of \(\hat{\varSigma }_{Y}=\frac{1}{T}YY{^{\prime }},\) arranged in descending order. Therefore,

$$\begin{aligned} V_{r}(\hat{P},\hat{F})=\frac{1}{N}\sum _{i=r+1}^{m}\hat{\lambda }_{i}. \end{aligned}$$
(7)

Using the expression of \(V_{k}(\hat{P},\hat{F})\) in (7), the functions in (6) can be written as

$$\begin{aligned} {\textit{IC}}_{j}(k)=\ln \left( \frac{1}{N}\sum _{i=k+1}^{m}\hat{\lambda }_{i}\right) +kg_{j}(N,T), \end{aligned}$$
(8)

where \(g_{j}(N,T)\) is defined according to the criteria in (6) for \(j=1,\) 2 and 3.
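The three criteria are easy to compute through the eigenvalue form in Eq. (8); a sketch (the function name `bai_ng_ic` is ours):

```python
import numpy as np

def bai_ng_ic(Y, r_max):
    """Bai and Ng (2002) criteria IC_1, IC_2, IC_3 computed through the
    eigenvalue form in Eq. (8); returns the three selected factor numbers."""
    N, T = Y.shape
    m = min(N, T)
    lam = np.sort(np.linalg.eigvalsh(Y @ Y.T / T))[::-1]  # eigenvalues, descending
    g = [(N + T) / (N * T) * np.log(N * T / (N + T)),     # penalty of IC_1
         (N + T) / (N * T) * np.log(m),                   # penalty of IC_2
         np.log(m) / m]                                   # penalty of IC_3
    r_hat = []
    for gj in g:
        ic = [np.log(lam[k:m].sum() / N) + k * gj for k in range(r_max + 1)]
        r_hat.append(int(np.argmin(ic)))
    return r_hat
```

For a simulated system with one strong white-noise factor, all three criteria should select one factor.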

2.2.2 Differenced eigenvalues

Onatski (2010) proposes an alternative procedure to select r, called edge distribution (ED), and shows that it outperforms the criteria proposed by Bai and Ng (2002) when the proportion of the variance attributed to the factors is small relative to the variance due to the idiosyncratic noises or when these are substantially correlated. Furthermore, computationally, the procedure proposed by Onatski (2010) allows the determination of the number of factors without previous estimation of the common component. Finally, it relaxes the standard assumption of PC factor extraction that the r largest eigenvalues of \(\hat{\varSigma }_{Y}\) grow proportionally to N. Instead of requiring that the cumulative effect of the factors grow as fast as N, Onatski (2010) imposes a structure on the idiosyncratic noises. Under the assumption of normality, both cross-sectional and temporal dependence are allowed. This procedure is based on determining a sharp threshold, \(\delta \), which consistently separates the bounded and diverging eigenvalues of \(\hat{\varSigma }_{Y}\). For any \(j>r\), the differences \(\hat{\lambda }_{j}-\hat{\lambda }_{j+1}\) converge to 0, while the difference \(\hat{\lambda }_{r}-\hat{\lambda }_{r+1}\) diverges to infinity when both N and T tend to infinity. Assuming that \(r_{\max }/N\rightarrow 0\), Onatski (2010) proposes the following algorithm in order to calibrate \(\delta \) and determine the number of factors:

  1. Obtain \(\hat{\lambda }_{i}\), \(i=1,\dots ,N\), and set \(j=r_{\max }+1\).

  2. Obtain \(\hat{\beta }\) as the ordinary least squares (OLS) estimator of the slope of a simple linear regression with constant, where the observations of the dependent variable are \(\left\{ \hat{\lambda }_{j},\dots ,\hat{\lambda }_{j+4}\right\} \) and the observations of the regressor variable are \(\lbrace (j-1)^{2/3},\dots ,(j+3)^{2/3}\rbrace \). Set the threshold \(\hat{\delta }=2|\hat{\beta }|\).

  3. Estimate \(\hat{r}=\max \{k\le r_{\max }\mid \hat{\lambda }_{k}-\hat{\lambda }_{k+1}\ge \hat{\delta }\}\), or \(\hat{r}=0\) if \(\hat{\lambda }_{k}-\hat{\lambda }_{k+1}<\hat{\delta }\) for all \(k\le r_{\max }\).

  4. Set \(j=\hat{r}+1\). Repeat steps 2 and 3 until \(\hat{r}\) converges.

Under suitable conditions, Onatski (2010) proves the consistency of \(\hat{r}\) for any fixed \(\delta >0\). He sets the number of iterations to four, although the convergence of the above algorithm is often achieved at the second iteration. Additionally, he sets \(r_{\max }=8\) when \(r=1,2,5\) and \(r_{\max }=20\) when \(r=15\).
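The steps above can be sketched as follows. We hedge one detail: we set the threshold as \(\hat{\delta }=2|\hat{\beta }|\), which is our reading of the calibration in Onatski (2010); the function name is ours:

```python
import numpy as np

def onatski_ed(eigvals, r_max, max_iter=4):
    """Edge-distribution (ED) estimator sketch following Onatski (2010).
    eigvals: sample eigenvalues in descending order (length >= r_max + 5).
    The threshold delta = 2*|beta_hat| is our reading of the calibration."""
    lam = np.asarray(eigvals, dtype=float)
    j = r_max + 1
    r_hat = None
    for _ in range(max_iter):
        y = lam[j - 1:j + 4]                         # lambda_j, ..., lambda_{j+4}
        x = np.arange(j - 1, j + 4) ** (2.0 / 3.0)   # (j-1)^{2/3}, ..., (j+3)^{2/3}
        beta = np.polyfit(x, y, 1)[0]                # OLS slope with constant
        delta = 2.0 * abs(beta)
        diffs = lam[:r_max] - lam[1:r_max + 1]       # lambda_k - lambda_{k+1}
        above = np.flatnonzero(diffs >= delta)
        r_new = int(above[-1] + 1) if above.size else 0
        if r_new == r_hat:                           # stop once r_hat converges
            break
        r_hat = r_new
        j = r_hat + 1
    return r_hat
```

With one clearly diverging eigenvalue and a slowly decaying bulk, the sketch settles after two iterations, as the paper reports is typical.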

2.2.3 Ratios of eigenvalues

Recently, Ahn and Horenstein (2013) propose two further estimators of the number of factors based on the fact that the r largest eigenvalues of \(\hat{\varSigma }_{Y}\) grow unbounded as N increases, while the other eigenvalues remain bounded. They show that these estimators are less sensitive to the choice of \(r_{\max }\) than those based on the Bai and Ng (2002) information criteria. The two new estimators are defined as the value of k, for \(k=0,\dots ,r_{\max },\) that maximizes the following ratios

$$\begin{aligned} \textit{ER}(k)= & {} \frac{\hat{\lambda }_{k}}{\hat{\lambda }_{k+1}}, \end{aligned}$$
(9)
$$\begin{aligned} \textit{GR}(k)= & {} \frac{\ln \left[ V_{k-1}(\hat{P},\hat{F})/V_{k}(\hat{P},\hat{F})\right] }{\ln \left[ V_{k}(\hat{P},\hat{F})/V_{k+1}(\hat{P},\hat{F})\right] }=\frac{\ln (1+\hat{\lambda }_{k}^{*})}{\ln (1+\hat{\lambda }_{k+1}^{*})}, \end{aligned}$$
(10)

where \(\hat{\lambda }_{0}=\frac{1}{m}\sum _{k=1}^{m}\hat{\lambda }_{k}/\ln (m)\) and \(\hat{\lambda }_{k}^{*}=\hat{\lambda }_{k}/\sum _{j=k+1}^{m}\hat{\lambda }_{j}\). The value of \(\hat{\lambda }_{0}\) has been chosen following the definition of Ahn and Horenstein (2013) according to which \(\hat{\lambda }_{0}\rightarrow 0\) and \(m\hat{\lambda }_{0}\rightarrow \infty \) as \(m\rightarrow \infty .\) Footnote 4

Note that both the numerator and denominator of GR(k) are growth rates of sums of residual variances computed with \(k-1\), k and \(k+1\) factors. Ahn and Horenstein (2013) show that, contrary to the estimators based on the Bai and Ng (2002) criteria, their estimators are far less dependent on \(r_{\max }\), and suggest choosing it as \(\min (r_{\max }^{*},0.1m)\) where \(r_{\max }^{*}=\#\left\{ k\mid N^{-1}\hat{\lambda }_{k}\ge V_{0}/m,k\ge 1\right\} \). Under the same assumptions as Bai and Ng (2006) and Onatski (2010), and allowing for some variables in Y to be perfectly multicollinear or to have zero idiosyncratic variances, they establish consistency of the \(\textit{ER}(k)\) and \(\textit{GR}(k)\) estimators. The results obtained in their Monte Carlo analysis show that the two estimators outperform the Bai and Ng (2002) information criteria and the Onatski (2010) estimator mainly when the idiosyncratic components are simultaneously cross-sectionally and serially correlated. However, the estimator proposed by Onatski (2010) outperforms the \(\textit{ER}(k)\) and \(\textit{GR}(k)\) ratios when the variance of the idiosyncratic component is larger than that of the common component (weak factors).
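Both ratios can be computed directly from the ordered eigenvalues; a sketch (the function name `er_gr` is ours, and k runs over \(0,\dots ,r_{\max }\) as in the text):

```python
import numpy as np

def er_gr(eigvals, r_max):
    """ER(k) and GR(k) estimators of Ahn and Horenstein (2013), Eqs. (9)-(10).
    eigvals: eigenvalues of the sample covariance matrix, descending order."""
    lam = np.asarray(eigvals, dtype=float)
    m = lam.size
    lam0 = lam.sum() / (m * np.log(m))                  # mock eigenvalue
    lam_ext = np.concatenate([[lam0], lam])             # lam_ext[k] = lambda_k
    # lambda*_k = lambda_k / sum_{j=k+1}^m lambda_j
    tail = np.array([lam[k:].sum() for k in range(r_max + 2)])
    lam_star = lam_ext[:r_max + 2] / tail
    er = lam_ext[:r_max + 1] / lam_ext[1:r_max + 2]     # ER(k)
    gr = np.log1p(lam_star[:-1]) / np.log1p(lam_star[1:])  # GR(k)
    return int(np.argmax(er)), int(np.argmax(gr))
```

With two eigenvalues well separated from a slowly decaying bulk, both ratios peak at \(k=2\).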

2.3 A note on the convergence of eigenvalues

The procedures to determine the number of common factors described above are based on the eigenvalues of the sample covariance matrix, \(\hat{\varSigma }_{Y}\). One of the main contributions of Bai and Ng (2002) is to show that the convergence of the eigenvalues of \(\frac{1}{TN}YY{^{\prime }}\) depends on m. Later, Kapetanios (2010) reviews the available literature on the topic, pointing out that the distribution of the largest eigenvalue depends in complicated ways on the parameters of the model. It seems that serial correlation affects both the parameters of the asymptotic limits and their functional form. Furthermore, he shows that the first r eigenvalues of \(\varSigma _{Y}\) increase at rate N, which follows from the fact that the r largest eigenvalues of \(F{^{\prime }}F\) will grow at rate N as long as the loading matrix P is not sparse, and suggests that it is reasonable to expect a similar behavior from the eigenvalues of the sample covariance matrix.

More recently, Onatski (2012, 2015) develops new asymptotics for the eigenvalues of the sample covariance matrix by considering that both the weights and the factors are fixed parameters.

3 Determining the number of factors after differencing

As mentioned in the Introduction, macroeconomic systems are often non-stationary. In this section, we analyze how transforming the data in a univariate fashion in order to achieve stationarity affects the performance of the factor number determination procedures described above. Note that differencing affects the ratio between the variances of the factors and idiosyncratic components, the temporal dependence structure and the cross-correlations among the idiosyncratic noises.

Consider the DFM given in Eqs. (1) to (3) in which \(\varPhi \) and \(\varGamma \) are diagonal matrices which may have 1’s in the main diagonal. Consequently, both the factors and the idiosyncratic noises can be either stationary or non-stationary random walks. Under this specification, the system of first-differenced data satisfies all conditions of Bai and Ng (2002); Onatski (2010) and Ahn and Horenstein (2013). After differencing the data in a univariate fashion, the DFM takes the following form

$$\begin{aligned} \varDelta Y_{t}= & {} P\varDelta F_{t}+\varDelta \varepsilon _{t}, \end{aligned}$$
(11)
$$\begin{aligned} \varDelta F_{t}= & {} (\varPhi -I)F_{t-1}+\eta _{t}, \end{aligned}$$
(12)
$$\begin{aligned} \varDelta \varepsilon _{t}= & {} (\varGamma -I)\varepsilon _{t-1}+a_{t}. \end{aligned}$$
(13)

Denote by \(\phi _{i}\) the i-th element in the main diagonal of \(\varPhi \). If \(|\phi _{i}|<1\), then the variance of the corresponding differenced factor is given by \(\sigma _{f_{i}}^{2}=2\sigma _{\eta _{i}}^{2}/(1+\phi _{i})\), where \(\sigma _{\eta _{i}}^{2}\) is the variance of \(\eta _{i}\). When \(\phi _{i}=0.5\), the variances of \(F_{t}\) and \(\varDelta F_{t}\) coincide, so in this case the variance of the factor is not changed by differencing. However, if \(\phi _{i}<0.5\), the variance of \(\varDelta F_{t}\) is larger than that of \(F_{t}\), while if \(\phi _{i}>0.5\), it is smaller. The same relation can be established between the variances of the elements in \(\varepsilon _{t}\) and \(\varDelta \varepsilon _{t}\) with respect to \(\gamma _{i}\), the i-th element in the main diagonal of \(\varGamma \). Note that if \(\varepsilon _{t}\) is stationary with autoregressive parameters smaller than 0.5 while \(F_{t}\) is non-stationary, then overdifferencing the idiosyncratic components may distort the determination of the number of factors: the relation between the variances of the common and idiosyncratic components is modified, with the variances of \(\varDelta F_{t}\) becoming smaller and the variances of \(\varDelta \varepsilon _{t}\) becoming larger. The dynamic dependence of the idiosyncratic noises of the differenced model is given by

$$\begin{aligned} Corr(\varDelta \varepsilon _{it},\varDelta \varepsilon _{it-h})= 0.5\gamma _i^{h-1}(\gamma _i - 1). \end{aligned}$$

Finally, note that differencing also affects the cross-correlations of the idiosyncratic noises. Consider, for example, that the correlation between \(\varepsilon _{it}\) and \(\varepsilon _{jt}\) is given by \(\rho \). If the idiosyncratic noises are stationary, then

$$\begin{aligned} Corr(\varDelta \varepsilon _{it},\varDelta \varepsilon _{jt})=\sigma _{\varDelta \varepsilon _{i}}^{-1}\sigma _{\varDelta \varepsilon _{j}}^{-1}\left( 2-\gamma _{i}-\gamma _{j}\right) \rho \sigma _{\varepsilon _{i}}\sigma _{\varepsilon _{j}}=\frac{0.5\left( 2-\gamma _{i}-\gamma _{j}\right) \rho }{\sqrt{(1-\gamma _{i})(1-\gamma _{j})}}. \end{aligned}$$
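The autocorrelation expression above is easy to check by simulation; the sketch below verifies the first two autocorrelations of the differenced noise for a single stationary AR(1) idiosyncratic component (the parameter values are illustrative):

```python
import numpy as np

# Simulate one stationary AR(1) idiosyncratic noise and check that the
# autocorrelations of its first difference match 0.5*gamma^(h-1)*(gamma-1).
rng = np.random.default_rng(3)
gamma, T = 0.6, 500_000
eps = np.zeros(T)
shocks = rng.standard_normal(T)
for t in range(1, T):
    eps[t] = gamma * eps[t - 1] + shocks[t]
d = np.diff(eps)
for h in (1, 2):
    emp = np.corrcoef(d[h:], d[:-h])[0, 1]
    theo = 0.5 * gamma ** (h - 1) * (gamma - 1)   # -0.2 for h=1, -0.12 for h=2
    print(h, round(emp, 3), theo)
```

With half a million observations the empirical autocorrelations agree with the theoretical values to about two decimal places.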

In order to simplify the analysis of the effects of univariate differencing on the determination of r, we consider \(\varGamma =\gamma I\) and \(\varSigma _{a}=\sigma _{a}^{2}I\), so that the idiosyncratic noises are homoscedastic and mutually uncorrelated and all of them are governed by the same autoregressive parameter. Given that there is no correlation between the factors and the idiosyncratic components, the covariance matrix of the first-differenced data is given by \(\varSigma _{\varDelta Y}=P\varSigma _{f}P{^{\prime }}+\sigma _{e}^{2}I\), where \(\varSigma _{f}\) is the covariance matrix of \(\varDelta F_{t}\) and \(\sigma _{e}^{2}=2\sigma _{a}^{2}/(1+\gamma )\) is the variance of each element in \(\varDelta \varepsilon _{t}.\) The ordered eigenvalues of \(\varSigma _{\varDelta Y}\) are equal to \(\sigma _{e}^{2}+\mu _{i}\) for \(i=1,\dots ,N\), where \(\mu _{i}\) is the i-th largest eigenvalue of \(P\varSigma _{f}P{^{\prime }}\). Furthermore, \(tr\left( P\varSigma _{f}P{^{\prime }}\right) =tr\left( P{^{\prime }}P\varSigma _{f}\right) =\sum _{j=1}^{r}\sigma _{f_{j}}^{2}\sum _{i=1}^{N}p_{ij}^{2}=\sum _{j=1}^{r}\mu _{j}\). Therefore, the sum of the r largest eigenvalues of \(\varSigma _{\varDelta Y}\) is given by \(\sum _{i=1}^{r}\lambda _{i}=r\sigma _{e}^{2}+\sum _{j=1}^{r}\sigma _{f_{j}}^{2}\sum _{i=1}^{N}p_{ij}^{2}\), while the remaining \(N-r\) eigenvalues are given by \(\lambda _{i}=\sigma _{e}^{2}.\)
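These population results can be verified numerically; the values of \(N\), \(r\), \(\gamma \), \(\sigma _{a}^{2}\) and \(\varSigma _{f}\) below are arbitrary illustrative choices:

```python
import numpy as np

# Numerical check of the eigenvalue structure of Sigma_dY = P Sigma_f P' + sig_e2*I.
rng = np.random.default_rng(4)
N, r, gamma, sig_a2 = 30, 2, 0.4, 1.0
P = rng.uniform(0, 1, size=(N, r))
Sigma_f = np.diag([2.0, 1.0])                 # covariance of Delta F_t (illustrative)
sig_e2 = 2 * sig_a2 / (1 + gamma)             # variance of each element of Delta eps_t
Sigma_dY = P @ Sigma_f @ P.T + sig_e2 * np.eye(N)
lam = np.sort(np.linalg.eigvalsh(Sigma_dY))[::-1]
mu = np.sort(np.linalg.eigvalsh(P @ Sigma_f @ P.T))[::-1]
# The r largest eigenvalues equal sig_e2 + mu_i; the remaining N - r equal sig_e2.
assert np.allclose(lam[:r], sig_e2 + mu[:r])
assert np.allclose(lam[r:], sig_e2)
# Trace identity: the sum of the mu_j equals sum_j sigma_fj^2 * sum_i p_ij^2.
assert np.isclose(mu[:r].sum(),
                  sum(Sigma_f[j, j] * (P[:, j] ** 2).sum() for j in range(r)))
```

The check exploits the fact that \(P\varSigma _{f}P{^{\prime }}\) and \(\varSigma _{\varDelta Y}\) share eigenvectors, so adding \(\sigma _{e}^{2}I\) simply shifts every eigenvalue.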

Consider the particular case of a unique random walk factor, i.e., \(r=1\) and \(\phi _{1}=1\). In this case, \(\lambda _{1}=\sigma _{\eta }^{2}\sum _{i=1}^{N}p_{i1}^{2}+\sigma _{e}^{2}\) and \(\lambda _{i}=\sigma _{e}^{2}\) for \(i=2,\dots ,N\). Consequently, the function to be minimized according to the Bai and Ng (2002) information criteria is given by

$$\begin{aligned} \textit{IC}(k)=\left\{ \begin{array}{l@{\quad }l} \ln \left( N^{-1}\sigma _{\eta }^{2}\sum _{i=1}^{N}p_{i1}^{2}+\sigma _{e}^{2}\right) , &{} k=0 \\ \ln (N-k)-\ln (N)+\ln (\sigma _{e}^{2})+kg(N,T), &{} k\ge 1.\end{array}\right. \end{aligned}$$

The procedure proposed by Onatski (2010) is based on the differences between adjacent eigenvalues. Note that, for \(j=2,\dots ,N\), \(\lambda _{j}-\lambda _{j+1}=0\). Therefore, the procedure should work as long as the difference between \(\lambda _{1}\) and \(\lambda _{2}\) is large. This difference is given by \(\lambda _{1}-\lambda _{2}=\sigma _{\eta }^{2}\sum _{i=1}^{N}p_{i1}^{2}\) and does not depend on the value of \(\sigma _{e}^{2}\). Therefore, for given weights and cross-sectional dimension, the procedure should work better when \(\sigma _{\eta }^{2}\) is large. Also, for a given value of \(\sigma _{\eta }^{2}\), the procedure should work better as N increases. Note that, in the first step of the algorithm proposed by Onatski (2010), \(\hat{\delta }=0\) because the population eigenvalues \(\lambda _{j}\) are all equal to \(\sigma _{e}^{2}\) for \(j\ge r_{\max }+1\), so the regression slope is zero.

Consider the ER(k) criterion of Ahn and Horenstein (2013) given in (9), which looks for the ratio between \(\lambda _{1}\) and \(\lambda _{2}\) to be large relative to the ratios between other adjacent eigenvalues. Note that, in the particular case we are considering, if \(N < T\), the mock eigenvalue is given by \(\lambda _{0}= \ln (N)^{-1}\left( \sigma _{e}^{2}+N^{-1}\sigma _{\eta }^{2}\sum _{i=1}^{N}p_{i1}^{2}\right) \), and, consequently,

$$\begin{aligned} ER(k)=\left\{ \begin{array}{l@{\quad }l} \frac{1+N^{-1}q\sum _{i=1}^{N}p_{i1}^{2}}{\ln (N)\left( 1+q\sum _{i=1}^{N}p_{i1}^{2}\right) }, &{} k=0 \\ 1+q\sum _{i=1}^{N}p_{i1}^{2}, &{} k=1 \\ 1, &{} k\ge 2, \end{array}\right. \end{aligned}$$

where \(q=\frac{\sigma _{\eta }^{2}(1+\gamma )}{2\sigma _{a}^{2}}.\) Note that if N is large enough, ER(0) should be close to 0. Therefore, for given weights, the criterion should work better when q is larger.

Finally, consider the GR(k) criterion of Ahn and Horenstein (2013). In this case, note that

$$\begin{aligned} \lambda _{i}^{*}=\left\{ \begin{array}{l@{\quad }l} (N\ln (N))^{-1}, &{} i=0 \\ (N-1)^{-1}(q\sum _{i=1}^{N}p_{i1}^{2}+1), &{} i=1 \\ (N-i)^{-1}, &{} i\ge 2. \end{array}\right. \end{aligned}$$

Therefore,

$$\begin{aligned} \frac{\ln (1+\lambda _{k}^{*})}{\ln (1+\lambda _{k+1}^{*})}=\left\{ \begin{array}{l@{\quad }l} \frac{\ln (N\ln N+1)-\ln (N\ln N)}{\ln (N + q\sum _{i=1}^Np_{i1}^2) - \ln (N-1)}, &{} k=0 \\ \frac{\ln (N + q\sum _{i=1}^Np_{i1}^2) - \ln (N-1)}{\ln (N-1) - \ln (N-2)},&{} k=1 \\ \frac{\ln (N+1-k)-\ln (N-k)}{\ln (N-k)-\ln (N-k-1)}, &{} k\ge 2.\end{array}\right. \end{aligned}$$
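The closed forms for the single random walk factor case can be checked against a direct computation from the population eigenvalues; the numerical values below are illustrative:

```python
import numpy as np

# Single random-walk factor after differencing: lambda_1 = sig_eta2*S + sig_e2
# and lambda_i = sig_e2 for i >= 2, with S = sum_i p_i1^2 and q = sig_eta2/sig_e2.
rng = np.random.default_rng(5)
N, gamma, sig_a2, sig_eta2 = 40, 0.5, 1.0, 1.0
p = rng.uniform(0, 1, N)
S = (p ** 2).sum()
sig_e2 = 2 * sig_a2 / (1 + gamma)
q = sig_eta2 * (1 + gamma) / (2 * sig_a2)     # equals sig_eta2 / sig_e2
lam = np.full(N, sig_e2)
lam[0] = sig_eta2 * S + sig_e2
# ER(1) computed from the eigenvalues equals the closed form 1 + q*S.
assert np.isclose(lam[0] / lam[1], 1 + q * S)
# GR(1) from the eigenvalues equals [ln(N+qS)-ln(N-1)] / [ln(N-1)-ln(N-2)].
lam_star1 = lam[0] / lam[1:].sum()
lam_star2 = lam[1] / lam[2:].sum()
gr1 = np.log1p(lam_star1) / np.log1p(lam_star2)
closed = (np.log(N + q * S) - np.log(N - 1)) / (np.log(N - 1) - np.log(N - 2))
assert np.isclose(gr1, closed)
```

Since \(qS\) is well above zero here, both ratios at \(k=1\) are far above one, which is what lets the criteria single out the factor.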

4 Finite sample performance

The results in the previous section are based on population covariance matrices and their corresponding eigenvalues. However, in practice, when determining the number of common factors in empirical applications, one should estimate the covariance matrix by its sample version and obtain the corresponding estimated eigenvalues. As mentioned above, the asymptotic distribution of estimated eigenvalues is complicated and not always known. The finite sample properties of the estimated eigenvalues depend on the temporal sample size used for their estimation, T, the cross-sectional dimension, N, the ratio between the variances of the common and idiosyncratic components, and the structure of the temporal and cross-sectional dependencies of the idiosyncratic noises. In this section, we carry out Monte Carlo experiments in order to analyze how the determination of the number of factors is affected in finite samples by univariate differencing of non-stationary data. We should note that the procedures considered have been developed for N and T going to infinity. However, when the procedures are implemented in practice, both N and T are finite. Our interest in this paper is to study the performance of the criteria under different combinations of N and T similar to those often encountered when dealing with systems of macroeconomic and financial variables. Furthermore, we want to investigate how small N and T can be for the procedures to be reliable under different structures of the factors and idiosyncratic noises. In this way, our results can be of interest for practitioners in empirical applications.

The experiments are based on \(R=500\) replications generated by the DFM in Eqs. (1) to (3) with \(N=(12, 50, 100, 200)\) and \(T=(100, 500)\)Footnote 5. Our simulations are divided into two parts. The first part is designed to investigate how the alternative estimators considered behave when detecting a unique random walk factor under different temporal and cross-sectional structures of the idiosyncratic noises. The second part is designed to analyze models with more than one factor.

Consider first a DFM defined as in Eqs. (1) to (3) with \(r=1\), \(\varPhi =1\) and \(\sigma _{\eta }^{2}=1.\) The factor loadings are generated by \(p_{i1}\thicksim U\left[ 0,1\right] \) with \(\sum _{i=1}^{N}p_{i1}^{2}=5.59\), 18.70, 34.63 and 65.56 for \(N=12\), 50, 100 and 200, respectively; Bai and Ng (2006) and Poncela and Ruiz (2016) also generate the factor loadings by the same distribution. We consider several structures for the idiosyncratic noises. First, the idiosyncratic noises are mutually uncorrelated and homoscedastic. In particular, the autoregressive coefficient matrix of the idiosyncratic components is diagonal, \(\varGamma =\gamma I,\) with \(\gamma =(-0.8,\) 1) and \(\varSigma _{a}=\sigma _{a}^{2}I\) with \(\sigma _{a}^{2}=1\) so that \(\sigma _{e}^{2}=10\) and 1 for the values of \(\gamma \) considered. Note that, differently from simulations carried out in related works, we consider both positive and negative values for the autoregressive parameter of the idiosyncratic noises; see, Pinheiro et al (2013) who estimate correlations for \(\varDelta \varepsilon _{t}\) between -0.6 and 0.9 when dealing with the U.S. monthly macroeconomic data set of Stock and Watson (2005). In order to separate the effects of the temporal dependence and the variance of the differenced idiosyncratic noises on the results, we also consider the combinations \(\gamma =-0.8\) and \(\sigma _{a}^{2}=0.1\) (\(\sigma _{e}^{2}=1)\) and \(\gamma =1\) and \(\sigma _{a}^{2}=10\) (\(\sigma _{e}^{2}=10).\) We introduce contemporaneous correlations among the idiosyncratic noises. \(\varSigma _a\) is generated with \(\sigma _{a}^{2}=0.1,\) 1 and 10 in the main diagonal and, following Onatski (2012), a Toeplitz structure with parameter \(b=0.5.\) Finally, we consider models with heteroscedastic idiosyncratic noises. 
The variances are generated by \(\sigma _{a_{i}}^{2}\thicksim U\left[ 0.5,1.5\right] \), \(\sigma _{a_{i}}^{2}\thicksim U\left[ 0.05,0.15\right] \) and \(\sigma _{a_{i}}^{2}\thicksim U\left[ 5,15\right] \); see Bai and Ng (2006) and Breitung and Eickmeier (2011), who use the same design to simulate heteroscedastic idiosyncratic noises. In these latter cases, we consider \(\gamma =-0.8\) and 1.
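As an illustration, the single-factor design above can be sketched as follows. This is a minimal simulation in Python; `simulate_dfm` and all variable names are ours, not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_dfm(N=12, T=100, gamma=-0.8, sigma2_a=1.0, sigma2_eta=1.0):
    """Single-factor DGP: random-walk factor (Phi = 1), U[0, 1] loadings
    and AR(1) idiosyncratic noises with coefficient gamma."""
    p = rng.uniform(0.0, 1.0, size=N)            # loadings p_i1 ~ U[0, 1]
    f = np.cumsum(rng.normal(0.0, np.sqrt(sigma2_eta), size=T))  # random-walk factor
    a = rng.normal(0.0, np.sqrt(sigma2_a), size=(T, N))
    eps = np.empty((T, N))
    eps[0] = a[0]
    for t in range(1, T):                        # AR(1) idiosyncratic noises
        eps[t] = gamma * eps[t - 1] + a[t]
    return np.outer(f, p) + eps                  # Y_t = P f_t + eps_t

Y = simulate_dfm()                               # gamma = -0.8, homoscedastic, uncorrelated
print(Y.shape)                                   # (100, 12)
```

Setting `gamma=1` reproduces the unit-root idiosyncratic case; for the cross-correlated design, the diagonal \(\varSigma _a\) would be replaced by a Toeplitz covariance matrix.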

For each replication, we generate observations \(Y_{t}\) and difference the data in a univariate fashion. Then, the eigenvalues of the sample covariance matrix \(\frac{1}{T-1}(\varDelta Y)(\varDelta Y){^{\prime }}\) are computed and \(r\) is determined using each of the procedures described above with \(r_{\max }=4\), 7 and 13 when \(N=12\), 50 and 200, respectively. The numbers of factors determined using the three criteria proposed by Bai and Ng (2002) are denoted by \(\hat{r}_{{\textit{IC}}{1}}\), \(\hat{r}_{{\textit{IC}}{2}}\) and \(\hat{r}_{{\textit{IC}}{3}}\), while the number of factors determined implementing the procedure due to Onatski (2010) is denoted by \(\hat{r}_{\textit{ED}}\). Finally, the numbers of factors estimated using the two ratios proposed by Ahn and Horenstein (2013) are denoted by \(\hat{r}_{\textit{ER}}\) and \(\hat{r}_{GR}\).
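To fix ideas, the two eigenvalue-ratio criteria can be sketched as follows; this is our own minimal implementation of the ER and GR estimators of Ahn and Horenstein (2013), not the authors' code.

```python
import numpy as np

def er_gr(eigvals, r_max):
    """ER and GR estimators of r from eigenvalues sorted in descending order."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    tail = np.cumsum(lam[::-1])[::-1]            # tail[k] = sum_{j >= k} lam[j]
    er = lam[:r_max] / lam[1:r_max + 1]          # ER_k = lam_k / lam_{k+1}
    star = lam[:r_max + 1] / tail[1:r_max + 2]   # lam*_k = lam_k / sum_{j > k} lam_j
    gr = np.log1p(star[:r_max]) / np.log1p(star[1:r_max + 1])
    return int(np.argmax(er)) + 1, int(np.argmax(gr)) + 1

# One dominant eigenvalue: both criteria point to a single factor
print(er_gr([10.0] + [1.0] * 9, r_max=4))        # (1, 1)
```

Both estimators pick the value of \(k\) that maximizes the corresponding ratio of adjacent (transformed) eigenvalues, so they need no penalty function, unlike the information criteria.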

Figure 1 plots, for \(N=12\) and \(T=100\), the Monte Carlo averages and 95 % confidence intervals, for homoscedastic and contemporaneously uncorrelated idiosyncratic noises, of i) the sample ordered eigenvalues; ii) their differences; and iii) their ratios, together with the corresponding population quantities, when \(\gamma =-0.8\) and \(\sigma _{a}^{2}=0.1\), \(\gamma =1\) and \(\sigma _{a}^{2}=1\), \(\gamma =-0.8\) and \(\sigma _{a}^{2}=1\), and \(\gamma =1\) and \(\sigma _{a}^{2}=10\). When the idiosyncratic noises are homoscedastic and white noise, according to the results in the previous section, the largest eigenvalue of the population covariance matrix of \(\varDelta Y\) is given by \(\lambda _{1}=\sigma _{e}^{2}+\sum _{i=1}^{N}p_{i1}^{2}\), while all other eigenvalues are given by \(\sigma _{e}^{2}\). Note that in the first two cases, \(\sigma _{e}^{2}=1\) and the population eigenvalues coincide. In the two latter cases, \(\sigma _{e}^{2}=10\). Figure 1 shows that, regardless of the value of \(\sigma _{e}^{2}\), the eigenvalues are better estimated when \(\gamma =1\) than when \(\gamma =-0.8\), with smaller biases and standard deviations. Obviously, given \(\gamma \), the eigenvalues are better estimated when \(\sigma _{a}^{2}\) is smaller. Therefore, when estimating the eigenvalues of the covariance matrix of \(\varDelta Y\), not only the relative variance of the differenced idiosyncratic noises but also their temporal dependence is important.
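This population eigenvalue structure is easy to verify numerically: with \(\sigma _{\eta }^{2}=1\) and white, homoscedastic differenced idiosyncratic noises, the covariance matrix of \(\varDelta Y\) is a rank-one matrix plus \(\sigma _{e}^{2}I\). A quick check in Python (variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
N, sigma2_e = 12, 1.0
p = rng.uniform(0.0, 1.0, size=N)                # loadings p_i1 ~ U[0, 1]
# Population covariance of dY: sigma_eta^2 * p p' + sigma_e^2 * I
cov = np.outer(p, p) + sigma2_e * np.eye(N)
lam = np.sort(np.linalg.eigvalsh(cov))[::-1]
print(np.isclose(lam[0], sigma2_e + p @ p))      # True: lambda_1 = sigma_e^2 + sum p_i1^2
print(np.allclose(lam[1:], sigma2_e))            # True: remaining eigenvalues = sigma_e^2
```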

Fig. 1

Eigenvalues of DFM with \(N = 12\), \(T = 100\), \(r = 1\), \(\phi = 1\), with \(\gamma = -0.8\) and \(\sigma _a^2 = 0.1\) (first row), \(\gamma = 1\) and \(\sigma _a^2 = 1\) (second row), \(\gamma = -0.8\) and \(\sigma _a^2 = 1\) (third row) and \(\gamma = 1\) and \(\sigma _a^2 = 10\) (fourth row). The first column plots the eigenvalues while the second and third columns plot their differences and ratios, respectively. The population eigenvalues are plotted in red, the Monte Carlo averages in black and the corresponding 95 % intervals in blue. (Color figure online)

Fig. 2

Eigenvalues of DFM with \(r = 1\), \(\phi = 1\) and \(\sigma _{\eta }^2 = 1\) when the idiosyncratic noises are AR(1) processes with \(\gamma = -0.8\) and \(\sigma _a^2 = 1\). The first column plots the eigenvalues, while the second and third columns plot their differences and ratios, respectively. The population eigenvalues are plotted in red, the Monte Carlo averages in black and the corresponding 95 % intervals in blue. First row \(N = 12, T = 100\); second row \(N = 12, T = 500\); third row \(N = 50, T = 100\); fourth row \(N = 50, T = 500\); fifth row \(N = 200, T = 100\) and sixth row \(N = 200, T = 500\). (Color figure online)

Fig. 3

Percentage of \(\hat{r} = r_{\max }\) (blue), \(r_{\max }> \hat{r} > r\) (green), \(\hat{r} = 1\) (red) and \(\hat{r} = 0\) (black) in a DFM with \(r = 1\), \(\phi = 1\), \(\sigma _{\eta }^2 = 1\), \(\gamma = -0.8\) and \(\sigma _a^2 = 0.1\). First row \(N = 12, T = 100\); second row \(N = 12, T = 500\); third row \(N = 50, T = 100\); fourth row \(N = 50, T = 500\); fifth row \(N = 200, T = 100\) and sixth row \(N = 200, T = 500\). In the first column, the idiosyncratic noises are homoscedastic and uncorrelated; in the second column, they are heteroscedastic, while in the third column they are cross-sectionally correlated. (Color figure online)

Fig. 4

Percentage of \(\hat{r} = r_{\max }\) (blue), \(r_{\max }> \hat{r} > r\) (green), \(\hat{r} = 1\) (red) and \(\hat{r} = 0\) (black) in a DFM with \(r = 1\), \(\phi = 1\), \(\sigma _{\eta }^2 = 1\), \(\gamma = 1\) and \(\sigma _a^2 = 1\). First row \(N = 12, T = 100\); second row \(N = 12, T = 500\); third row \(N = 50, T = 100\); fourth row \(N = 50, T = 500\); fifth row \(N = 200, T = 100\) and sixth row \(N = 200, T = 500\). In the first column, the idiosyncratic noises are homoscedastic and uncorrelated; in the second column, they are heteroscedastic, while in the third column they are cross-sectionally correlated. (Color figure online)

Fig. 5

Percentage of \(\hat{r} = r_{\max }\) (blue), \(r_{\max }> \hat{r} > r\) (green), \(\hat{r} = 1\) (red) and \(\hat{r} = 0\) (black) in a DFM with \(r = 1\), \(\phi = 1\), \(\sigma _{\eta }^2 = 1\), \(\gamma = -0.8\) and \(\sigma _a^2 = 1\). First row \(N = 12, T = 100\); second row \(N = 12, T = 500\); third row \(N = 50, T = 100\); fourth row \(N = 50, T = 500\); fifth row \(N = 200, T = 100\) and sixth row \(N = 200, T = 500\). In the first column, the idiosyncratic noises are homoscedastic and uncorrelated; in the second column, they are heteroscedastic, while in the third column they are cross-sectionally correlated. (Color figure online)

In order to analyze the separate effects of the cross-sectional and temporal dimensions of the system on the estimation of the eigenvalues, Fig. 2 plots the same quantities as in Fig. 1 for \(\gamma =-0.8\) and \(\sigma _{a}^{2}=1\), when \(N=12, 50\) and 200, and \(T=100\) and 500. Note that when \(N\) increases, the first eigenvalue of the population covariance matrix increases, since \(\sum _{i=1}^{N}p_{i1}^{2}\) grows with \(N\), and it is estimated with larger biases and standard deviations. All other eigenvalues are also estimated with larger biases and standard deviations. Therefore, given \(T\), increasing \(N\) could lead to an even worse estimation of the eigenvalues. However, as expected, given \(N\), an increase in \(T\) leads to smaller biases and standard deviations of the estimated eigenvalues.

Fig. 6

Percentage of \(\hat{r} = r_{\max }\) (blue), \(r_{\max }> \hat{r} > r\) (green), \(\hat{r} = 2\) (red), \(\hat{r} = 1\) (gold) and \(\hat{r} = 0\) (black) in a DFM with \( r = 2\), \(\gamma = -0.8\) and \(\sigma _a^2 = 0.1\). System dimensions \(N = 12\), \(T = 100\) (first column); \(N = 200\), \(T = 500\) (second column). The factors are two random walks with variance \(\sigma _{\eta }^2 = 1\) (first row); two random walks with variances \(\sigma _{\eta _1}^2 = 1\) and \(\sigma _{\eta _2}^2 = 5\) (second row); and a random walk with variance \(\sigma _{\eta _1}^2 = 1\) and a stationary factor with \(\sigma _{\eta _2}^2 = 1\) (third row). (Color figure online)

Fig. 7

PC estimated factors (first row) and corresponding factor weights (second row) obtained assuming \(r=3\) and using original inflation rates (first column) and differenced rates (second column)

The finite sample properties of the estimated eigenvalues have effects on the properties of the procedures to detect the number of factors. Figure 3 plots, for each of the procedures considered, the percentage of replicates in which the estimated number of common factors is: (i) \(\hat{r}=0\); (ii) \(\hat{r}=r\); (iii) \(\hat{r}=r_{\max }\); and (iv) \(r_{\max }>\hat{r}>r\), when \(\gamma =-0.8\) and \(\sigma _{a}^{2}=0.1\) (\(\sigma _{e}^{2}=1\)), when \(N=12, 50\) and 200 and \(T=100\) and 500. We consider idiosyncratic noises that are homoscedastic and uncorrelated; heteroscedastic and uncorrelated; and homoscedastic and cross-sectionally correlated. We can observe that, regardless of the structure of the idiosyncratic noises and the cross-sectional dimension, when \(T=100\), the three information criteria tend to overestimate \(r\), and in most of the replicates \(\hat{r}_{\textit{IC}}=r_{\max }\). When \(T=500\), the percentage of \(\hat{r}_{\textit{IC}}=r\) is close to 100 % if the idiosyncratic errors are homoscedastic and cross-sectionally uncorrelated, even if \(N=12\); however, if there is cross-sectional correlation, \(\hat{r}_{\textit{IC}}=r_{\max }\). On the other hand, increasing \(N\) leads to a larger percentage of \(\hat{r}_{\textit{IC}}>r\). The performance of the two estimators based on ratios of eigenvalues, \(\hat{r}_{\textit{ER}}\) and \(\hat{r}_{GR}\), is very similar and always better than that of the estimator based on eigenvalue differences, \(\hat{r}_{\textit{ED}}\). The percentages of correct estimation of \(r\) when implementing the \(\hat{r}_{\textit{ER}}\) and \(\hat{r}_{GR}\) estimators are close to 90 % when \(N=12\) and \(T=100\) and increase to 100 % when increasing either \(N\) or \(T\). The results for heteroscedastic and cross-correlated idiosyncratic noises are very similar.

Figure 4 plots the same quantities as in Fig. 3 when \(\gamma =1\) and \(\sigma _{a}^{2}=1\). Note that this case is comparable to that in Fig. 3 in the sense that the variance of the differenced idiosyncratic noises is the same, \(\sigma _{e}^{2}=1\), but the differenced idiosyncratic noises are cross-sectionally uncorrelated white noises. We can observe that the performance of the alternative procedures to estimate \(r\) is rather different from that in Fig. 3. All procedures have percentages of correct estimation close to 100 %, except the information criteria when \(N=12\) and the idiosyncratic errors are cross-correlated; in this latter case, \(\hat{r}_{\textit{IC}}=r_{\max }\). Consequently, not only the variance of the differenced idiosyncratic noises but also their dependence structure affects the procedures to detect the number of factors. Only the \(\hat{r}_{\textit{ER}}\) and \(\hat{r}_{GR}\) estimators seem to be robust to both.

Finally, Fig. 5 considers the case with \(\gamma =-0.8\) and \(\sigma _{a}^{2}=1\), so that \(\sigma _{e}^{2}=10\). In this case, when \(T=100\), the information criteria behave very similarly to the case with \(\sigma _{a}^{2}=0.1\), with \(\hat{r}_{\textit{IC}}=r_{\max }\). However, when \(N=(12, 50)\) and \(T=500\), the information criteria estimate \(\hat{r}_{\textit{IC}}=0\). Therefore, it seems that they are more affected by the temporal dependence of the differenced idiosyncratic noises than by their variance. On the other hand, when looking at the performance of \(\hat{r}_{\textit{ER}}\) and \(\hat{r}_{GR}\), we can observe that it clearly deteriorates when \(\sigma _{e}^{2}=10\); therefore, their performance clearly depends on \(\sigma _{e}^{2}\). The behavior of \(\hat{r}_{\textit{ED}}\) depends both on \(\gamma \) and \(\sigma _{e}^{2}\), with a rather large percentage of cases in which \(\hat{r}_{\textit{ED}}=0\).

In the second part of the Monte Carlo experiments, we consider models in which \(r=2.\) First, we consider a second non-stationary common factor, i.e., \(\varPhi =I\) and \(\varSigma _{\eta }=I\). Second, the covariance matrix of the factor disturbances is given by \(\varSigma _{\eta } = \text {diag}(1, 5)\). Finally, the last model considered has a second stationary factor with \(\varSigma _{\eta }=I\) and \(\varPhi = \text {diag}(1, 0.5)\).
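The three two-factor designs differ only in \(\varPhi \) and \(\varSigma _{\eta }\); they can be sketched as follows (Gaussian disturbances assumed, variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_factors(T=100, phi=(1.0, 1.0), sigma2_eta=(1.0, 1.0)):
    """Two-factor block F_t = Phi F_{t-1} + eta_t with diagonal Phi.
    phi = (1, 1): two random walks; sigma2_eta = (1, 5): different
    disturbance variances; phi = (1, 0.5): a random walk plus a
    stationary factor."""
    phi = np.asarray(phi)
    s = np.sqrt(np.asarray(sigma2_eta))
    F = np.zeros((T, 2))
    for t in range(1, T):
        F[t] = phi * F[t - 1] + s * rng.normal(size=2)
    return F

F = simulate_factors(phi=(1.0, 0.5))             # third DGP
print(F.shape)                                   # (100, 2)
```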

For each of the three data generating processes (DGPs) above, Fig. 6 plots the percentages of (i) \(\hat{r}=0\); (ii) \(\hat{r}=1\); (iii) \(\hat{r}=r\); (iv) \(\hat{r}=r_{\max }\); and (v) \(r_{\max }>\hat{r}>r\), when \(\gamma =-0.8\) and \(\sigma _{a}^{2}=0.1\) (\(\sigma _{e}^{2}=1\)) and for \(N=12\) with \(T=100\) and \(N=200\) with \(T=500\). First of all, observe that when \(N=12\) and \(T=100\), the information criteria choose \(\hat{r}=r_{\max }\) in all cases. Increasing the dimensions of the system helps for \(\hat{r}_{IC1}\) and \(\hat{r}_{IC2}\) but not for \(\hat{r}_{IC3}\). When looking at the ED, ER and GR criteria, we can observe that, regardless of the structure of the two factors, when \(N=200\) and \(T=500\), all of them have percentages of correct determination of the number of factors close to 100 %. However, when \(N=12\) and \(T=100\), there is a large percentage of replicates in which \(\hat{r}=1\). In this case, the ED procedure is better than the two procedures based on ratios. When the two common random walks in the original data have different variances, the ED procedure has an acceptable proportion of cases in which \(\hat{r}=r\).

5 Empirical analysis

In this section, we implement the procedures considered in this paper to determine the number of common factors in a system of inflation rates in 15 euro area countries, namely, Austria (AUT), Belgium (BEL), Denmark (DEN), Finland (FIN), France (FRA), Germany (GER), Greece (GRE), Ireland (IRL), Italy (ITA), Luxembourg (LUX), Netherlands (NED), Portugal (POR), Spain (SPA), Sweden (SWE) and United Kingdom (UK). Prices, observed monthly from January 1996 to November 2015, \(P_{it}\), have been obtained from the OECD database and transformed into annual inflation rates as \(y_{it}=100\times \varDelta _{12}\log (P_{it})\). When needed, the inflation rates have been corrected for outliers using the software developed by the United States Census Bureau. Following Stock and Watson (2005), outliers are replaced by the median of the 5 previous observations. Furthermore, the inflation series have been deseasonalized when appropriate.
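The transformation from monthly prices to year-on-year inflation can be written as a small sketch; `annual_inflation` is our own name for this helper.

```python
import numpy as np

def annual_inflation(P):
    """Year-on-year inflation from monthly prices:
    y_t = 100 * (log P_t - log P_{t-12})."""
    logp = np.log(np.asarray(P, dtype=float))
    return 100.0 * (logp[12:] - logp[:-12])

# Prices growing at a constant 1 % (log) per month give a flat 12 % annual rate
P = np.exp(0.01 * np.arange(24))
y = annual_inflation(P)
print(np.allclose(y, 12.0))                      # True
```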

Then, as in Reis and Watson (2010) and Altissimo et al (2009), we determine the number of factors using both the inflation data in levels and after differencing. All procedures are implemented with \(r_{\max }=5\). Regardless of whether the procedures are implemented on the original or differenced inflation rates, the information criteria estimate \(\hat{r}=5\) and \(\hat{r}_{\textit{ER}}=\hat{r}_{GR}=1\). However, after differencing, the ED procedure detects just one factor, while \(\hat{r}_{\textit{ED}} = 3\) in the original inflation series. According to our Monte Carlo experiments, if the true number of factors is \(r\ge 2\), then the ED, ER and GR procedures tend to detect \(\hat{r}<r\) when applied to differenced data. Therefore, we could expect the true number of factors to be larger than one. Consequently, we extract the factors assuming that \(r=3\), both from the original and differenced inflation series. In the latter case, the extracted factors are reaccumulated as proposed by Bai and Ng (2004). The extracted factors and their corresponding weights are plotted in Fig. 7; compare with the factor extracted by Delle Monache et al (2016) using quarterly inflation for a panel of 12 inflation rates from a sample of EMU countries. In Fig. 7, there are no significant differences between the factors estimated using the original and differenced inflation rates, except for the centering of the latter. This result could be expected since the variances of all the idiosyncratic noises are rather small, with values between 0.03 and 0.1. Consequently, the differenced idiosyncratic noises are white noises with small variances.

Finally, we should point out that the main difference between extracting factors assuming that \(r = 1\) or \(r = 3\) lies in the interpretability. Recall that PC consistently estimates the space spanned by the factors. Therefore, assuming that \(r = 3\), we can obtain rotations of the factor space that are not available when assuming that \(r = 1\).

6 Conclusions

Differencing non-stationary cointegrated systems has effects on the properties of factor determination procedures. We show that both the variance and the dependence structure of the differenced idiosyncratic noises are important when measuring these effects. If \(r=1\), the ER and GR procedures work well, even in systems of relatively small dimensions, under all the structures of the idiosyncratic noises considered in this paper. Only when the variance of the differenced idiosyncratic noises is very large with respect to the variance of the differenced factor does their performance worsen, although it remains better than that of the alternatives. However, the performance of all procedures deteriorates when \(r=2\). In this case, the ED procedure seems to work better.