Estimation of dynamic mixed double factors model in high-dimensional panel data

Soft Computing (Focus)

Abstract

This paper develops dimension reduction techniques for panel data analysis when the numbers of individuals and indicators are both very large. We use principal component analysis to represent a large number of indicators by a small number of common factors in a factor model. We propose the dynamic mixed double factor model (DMDFM) to capture cross-sectional and time series correlation through an interactive factor structure. The DMDFM not only reduces the dimension of the indicators but also handles the mixed time series and cross-sectional effects. Unlike other models, the mixed factor model has two types of common factors: the regressor factors capture the common trend and reduce the dimension, while the error component factors capture the differences and weak correlation among individuals. Monte Carlo simulation results show that the generalized method of moments estimators perform well in terms of unbiasedness and consistency. The simulations also show that the DMDFM can effectively improve the predictive power of the model.
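
As a rough illustration of the dimension reduction step described above, the following Python sketch extracts a small number of common factors from a large indicator matrix by principal components. It is a minimal sketch under stated assumptions, not the estimation procedure of the paper: the function name `extract_factors`, the simulated data, and the normalization \(F^{'}F/T=I_{r}\) are illustrative choices.

```python
import numpy as np

def extract_factors(X, r):
    """Estimate r common factors from a T x p indicator matrix by PCA.

    Hypothetical helper for illustration only. Factors are normalized so
    that F'F / T equals the identity matrix.
    """
    Xc = X - X.mean(axis=0)                      # center each indicator
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = X.shape[0]
    factors = np.sqrt(T) * U[:, :r]              # T x r estimated factors
    loadings = Xc.T @ factors / T                # p x r estimated loadings
    explained = (s[:r] ** 2).sum() / (s ** 2).sum()  # variance share of r factors
    return factors, loadings, explained

# Simulated data (an assumption): p = 50 indicators driven by r = 2 factors.
rng = np.random.default_rng(0)
T, p, r = 200, 50, 2
F = rng.standard_normal((T, r))
L = rng.standard_normal((p, r))
X = F @ L.T + 0.5 * rng.standard_normal((T, p))

F_hat, L_hat, share = extract_factors(X, r)
print(F_hat.shape, L_hat.shape, round(share, 3))
```

In this design the two estimated factors replace the fifty indicators as regressors, which is the sense in which the factor structure reduces dimension.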

References

  • Ahn SG, Lee YH, Schmidt P (2001) GMM Estimation of linear panel data models with time-varying individual effects. J Econ 101:219–255

  • Andrews DWK (2005) Cross-section regression with common shocks. Econometrica 73:1551–1585

  • Anderson B, Deistler M (2008) Generalized linear dynamic factor models—a structure theory. In: 2008 IEEE conference on decision and control

  • Arellano M, Bond SR (1991) Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Rev Econ Stud 58:277–297

  • Arellano M, Bover O (1995) Another look at the instrumental variable estimation of error components models. J Econ 68:29–51

  • Bai J (2003) Inferential theory for factor models of large dimensions. Econometrica 71:135–173

  • Bai J (2009) Panel data models with interactive fixed effects. Econometrica 77:1229–1279

  • Bai J, Ng S (2002) Determining the number of factors in approximate factor models. Econometrica 70:191–221

  • Chamberlain G, Rothschild M (1983) Arbitrage, factor structure and mean-variance analysis in large asset markets. Econometrica 51:1281–1304

  • Fan J, Fan Y, Lv J (2008) High dimensional covariance matrix estimation using a factor model. J Econ 147:186–197

  • Forni M, Hallin M, Lippi M, Reichlin L (2000) The generalized dynamic factor model: identification and estimation. Rev Econ Stat 82:540–554

  • Hallin M, Liska R (2007) Determining the number of factors in the general dynamic factor model. J Am Stat Assoc 102:603–617

  • Hansen LP (1982) Large sample properties of generalized method of moments estimators. Econometrica 50:1029–1054

  • Harding M, Nair KK (2009) Estimating the number of factors and lags in high dimensional dynamic factor models. Mimeo

  • Hsiao C (2003) Analysis of panel data. Cambridge University Press, New York

  • Mallows CL (1973) Some comments on Cp. Technometrics 15:661–675

  • Moon HR, Perron B (2004) Testing for a unit root in panels with dynamic factors. J Econ 122:81–126

  • Newey W, McFadden D (1994) Large sample estimation and hypothesis testing. In: Engle RF, McFadden D (eds) Handbook of econometrics. North Holland, Amsterdam, pp 2111–2245

  • Pesaran MH (2006) Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica 74:967–1012

  • Ross S (1976) The arbitrage theory of capital asset pricing. J Econ Theory 13:341–360

  • Stock JH, Watson MW (2002) Forecasting using principal components from a large number of predictors. J Am Stat Assoc 97:1167–1179

  • Stock JH, Watson MW (2005) Implications of dynamic factor models for VAR analysis. Princeton University, Princeton

Acknowledgements

This study was funded by the National Natural Science Foundation of China (714711730, 71873137 & 71271210) and supported by the fund for building world-class universities (disciplines) of Renmin University of China. Fang’s study was funded by the Philosophy and Social Science Fund of Anhui (AHSKY2015D53).

Author information

Corresponding author

Correspondence to Bo Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by Y. Ni.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Proof of theoretical results

A. Proof of Theorem 4.1.

Denote \(b(z,\beta )=Z_{i}\Delta \epsilon _{i}\), where \(\beta =(\beta _{L}^{'},\beta _{F}^{'})^{'}\). From Eq. (4.8), we have \(E[b(z,\beta )]=0\). Taking the partial derivative of \(b(z,\beta )\) with respect to each parameter to be estimated, \(\partial b(z,\beta )/\partial \beta \), let

$$\begin{aligned} Db(\beta _{L},\beta _{F})=\left( \partial b(z,\beta )/\partial \beta _{L}^{'},\partial b(z,\beta )/\partial \beta _{F}^{'}\right) ^{'} \end{aligned}$$

By the uniform consistency of the random disturbance term, a Taylor series expansion around \(\beta _{L}\) and \(\beta _{F}\) gives

$$\begin{aligned} b(z,{\hat{\beta }})= & {} b(z,\beta )+Db\left( \beta _{L}^{*},\beta _{F}^{*}\right) (b(z, {\hat{\beta }})\nonumber \\&-\,b(z,\beta ))+o(b(z,\beta )) \end{aligned}$$
(6.1)

where \({\hat{\beta }}=({\hat{\beta }}_{L}^{'},{\hat{\beta }}_{F}^{'})^{'}\), and \(\beta _{L}^{*}\) and \(\beta _{F}^{*}\) lie between \(\beta _{L}\) and \({\hat{\beta }}_{L}\), and between \(\beta _{F}\) and \({\hat{\beta }}_{F}\), respectively. Multiplying both sides by the weighting matrix A gives

$$\begin{aligned} Ab(z,{\hat{\beta }})= & {} A b(z,\beta )+ADb\left( \beta _{L}^{*},\beta _{F}^{*}\right) (b(z,{\hat{\beta }})\nonumber \\&-\,b(z,\beta ))+o(b(z,\beta )) \end{aligned}$$
(6.2)

Given the following three items:

  1. (i)

    Under the preceding assumptions, given the optimal weighting matrix \(A_{O}\), we obtain a unique optimal estimator of \(\beta \). The parameter vector \(\beta \) lies in the Euclidean space \(R^{n}\), and the parameter space \(\Theta \) of \(\beta \) is a closed and bounded subset of \(R^{n}\).

  2. (ii)

    For \(b(z,\beta )=Z_{i}\Delta \epsilon _{i}\), \(\forall \epsilon >0\), from (6.1)

    $$\begin{aligned} E(b(z,{\hat{\beta }}))=b(z,\beta ) \end{aligned}$$

    so,

    $$\begin{aligned} \big |b(z,{\hat{\beta }})-b(z,\beta )\big |\xrightarrow {p}0 \end{aligned}$$
    (6.3)

    for given matrix A, denote

    $$\begin{aligned} \hat{S}_{N}(\beta )=b\big (z,{\hat{\beta }}\big )^{'}\hat{A}b\big (z,{\hat{\beta }}\big ) \end{aligned}$$

    and

    $$\begin{aligned} S_{0}(\beta )=b(z,\beta )^{'}Ab(z,\beta ) \end{aligned}$$

    from (6.3), \(S_{0}(\beta )\) is continuous.

  3. (iii)

    Next, we show that \(\hat{S}_{N}(\beta )\) converges in probability to \(S_{0}(\beta )\).

$$\begin{aligned}&\left| \hat{S}_{N}(\beta )-S_{0}(\beta )\right| =\left| b(z,{\hat{\beta }})^{'}\hat{A}b(z,{\hat{\beta }})-b(z,\beta )^{'}Ab(z,\beta )\right| \\&\quad =\,\bigg |\big (b(z,{\hat{\beta }})-b(z,\beta )\big )^{'}\hat{A}\big (b(z,{\hat{\beta }})-b(z,\beta )\big )\\&\qquad +\,b(z,\beta )^{'}\hat{A}\big (b(z,{\hat{\beta }})-b(z,\beta )\big )+\big (b(z,{\hat{\beta }})-b(z,\beta )\big )^{'}\hat{A}b(z,\beta )\\&\qquad +\,b(z,\beta )^{'}\hat{A}b(z,\beta )-b(z,\beta )^{'}Ab(z,\beta )\bigg |\\&\quad =\,\bigg |\big (b(z,{\hat{\beta }})-b(z,\beta )\big )^{'}\hat{A}\big (b(z,{\hat{\beta }})-b(z,\beta )\big )\\&\qquad +\,b(z,\beta )^{'}\hat{A}\big (b(z,{\hat{\beta }})-b(z,\beta )\big )+\big (b(z,{\hat{\beta }})-b(z,\beta )\big )^{'}\hat{A}b(z,\beta )\\&\qquad +\,b(z,\beta )^{'}(\hat{A}-A)b(z,\beta )\bigg |\\&\quad =\,\bigg |\big (b(z,{\hat{\beta }})-b(z,\beta )\big )^{'}\hat{A}\big (b(z,{\hat{\beta }})-b(z,\beta )\big )\\&\qquad +\,b(z,\beta )^{'}(\hat{A}+\hat{A}^{'})\big (b(z,{\hat{\beta }})-b(z,\beta )\big )\\&\qquad +\,b(z,\beta )^{'}(\hat{A}-A)b(z,\beta )\bigg | \end{aligned}$$

By the triangle inequality,

$$\begin{aligned}&\le \bigg |(b(z,{\hat{\beta }})-b(z,\beta ))^{'}\hat{A}(b(z,{\hat{\beta }})\\&\quad -b(z,\beta ))\bigg |+\bigg |b(z,\beta )^{'}(\hat{A}+\hat{A}^{'})(b(z,{\hat{\beta }})\\&\quad -b(z,\beta ))\bigg | +\bigg |b(z,\beta )^{'}(\hat{A}-A)b(z,\beta )\bigg | \end{aligned}$$

By the Cauchy–Schwarz inequality,

$$\begin{aligned}&\le \big \Vert b(z,{\hat{\beta }})-b(z,\beta )\big \Vert ^{2}\big \Vert \hat{A}\big \Vert +2\big \Vert b(z,\beta )\big \Vert \big \Vert b(z,{\hat{\beta }})\\&\quad -\,b(z,\beta )\big \Vert \big \Vert \hat{A}\big \Vert +\big \Vert b(z,\beta )\big \Vert ^{2}\big \Vert \hat{A}-A\big \Vert \end{aligned}$$

because

$$\begin{aligned}&b(z,{\hat{\beta }})-b(z,\beta )\xrightarrow {p}0\\&\quad \hat{A}-A\xrightarrow {p}0 \end{aligned}$$

we have

$$\begin{aligned} \left| \hat{S}_{N}(\beta )-S_{0}(\beta )\right| \xrightarrow {p}0 \end{aligned}$$

The conclusion then follows from the uniform convergence theorem of Newey and McFadden (1994). \(\square \)
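
The bound used in step (iii) can be checked numerically. The Python sketch below is only an illustration under assumptions: the vectors and matrices are random stand-ins for \(b(z,\beta )\), \(b(z,{\hat{\beta }})\), \(A\) and \(\hat{A}\), the helper names `spectral` and `gap_and_bound` are hypothetical, and the matrix norm is taken to be the spectral norm. It verifies the three-term decomposition and shows the bound shrinking as the perturbations shrink.

```python
import numpy as np

def spectral(M):
    """Spectral (operator 2-) norm of a matrix."""
    return np.linalg.norm(M, 2)

def gap_and_bound(b, b_hat, A, A_hat):
    """Return |S_N - S_0|, the absolute three-term decomposition, and the norm bound."""
    d = b_hat - b
    gap = abs(b_hat @ A_hat @ b_hat - b @ A @ b)
    decomposition = (d @ A_hat @ d
                     + b @ (A_hat + A_hat.T) @ d
                     + b @ (A_hat - A) @ b)
    bound = (np.linalg.norm(d) ** 2 * spectral(A_hat)
             + 2 * np.linalg.norm(b) * np.linalg.norm(d) * spectral(A_hat)
             + np.linalg.norm(b) ** 2 * spectral(A_hat - A))
    return gap, abs(decomposition), bound

# Random stand-ins (assumptions) for the moment vector and weighting matrix.
rng = np.random.default_rng(1)
m = 6
b = rng.standard_normal(m)
M = rng.standard_normal((m, m))
A = M @ M.T                        # symmetric weighting matrix
db = rng.standard_normal(m)        # direction of the error in b
dA = rng.standard_normal((m, m))   # direction of the error in A

rows = []
for scale in (1e-1, 1e-2, 1e-3):
    rows.append(gap_and_bound(b, b + scale * db, A, A + scale * dA))

for scale, (gap, dec, bound) in zip((1e-1, 1e-2, 1e-3), rows):
    print(f"scale={scale:g}  gap={gap:.3e}  bound={bound:.3e}")
```

As both perturbations tend to zero, the bound and therefore the gap tend to zero, which mirrors the convergence argument above.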

B. Proof of Theorem 4.2.

  1. (1)

    Because

    $$\begin{aligned}&\partial R_{1}(\beta _{L},\beta _{F})/\partial \beta =\partial \left( b(z,\beta )^{'}Ab(z,\beta )\right) /\partial \beta \\&\quad =\left( \partial b(z,\beta )^{'}/\partial \beta \right) Ab(z,\beta )+\left( \partial b(z,\beta )^{'}/\partial \beta \right) Ab(z,\beta )\\&\quad =2\left( \partial b(z,\beta )^{'}/\partial \beta \right) Ab(z,\beta ) \end{aligned}$$

    where, for notational simplicity, \(\beta =(\beta _{L},\beta _{F})^{'}\). With this notation, the GMM estimator solves the first-order condition, which gives

    $$\begin{aligned} R_{1}({\hat{\beta }})^{'}Ab(z,{\hat{\beta }})=0 \end{aligned}$$
    (6.4)

    from (6.1), for the optimal weighting matrix \(A_{O}\), we have

    $$\begin{aligned} R_{1}({\hat{\beta }})^{'}A_{O}\sqrt{N}b(z,{\hat{\beta }})= & {} R_{1}(\beta )^{'}A_{O}\sqrt{N}b (z,{\hat{\beta }})\nonumber \\&+\,o(b(z,\beta )) \end{aligned}$$
    (6.5)

    using a Taylor series expansion around \(\beta \),

    $$\begin{aligned}&R_{1}(\beta )^{'}A_{O}\sqrt{N}b(z,{\hat{\beta }})=R_{1}(\beta )^{'}A_{O} \left( \sqrt{N}b(z,\beta ) \right. \\&\quad \left. +\,R_{1}(\beta )\sqrt{N}\left( {\hat{\beta }}-\beta \right) \right) +o(b(z,\beta )) \end{aligned}$$

    from (6.4) and (6.5), we have

    $$\begin{aligned}&R_{1}(\beta )^{'}A_{O}R_{1}(\beta )\sqrt{N}\left( {\hat{\beta }}-\beta \right) \\&\quad =-R_{1}(\beta )^{'}A_{O}\sqrt{N}b(z,\beta )+o(b(z,\beta )) \end{aligned}$$

    so

    $$\begin{aligned}&\sqrt{N}({\hat{\beta }}-\beta )=-\left( R_{1}(\beta )^{'}A_{O}R_{1}(\beta )\right) ^{-1}\\&\quad R_{1}(\beta )^{'}A_{O}\sqrt{N}b(z,\beta )+o(b(z,\beta )) \end{aligned}$$

    by Eq. (4.15) above, we have

    $$\begin{aligned} \sqrt{N}b(z,\beta )\xrightarrow {d}N(0,D_{1}) \end{aligned}$$

    and

    $$\begin{aligned} \left( R_{1}(\beta )^{'}A_{O}R_{1}(\beta )\right) ^{-1}R_{1}(\beta )^{'}A_{O} \end{aligned}$$

    is a deterministic matrix, so

    $$\begin{aligned} \sqrt{N}\left( {\hat{\beta }}-\beta \right) \xrightarrow {d}N(0,\Sigma _{1}) \end{aligned}$$

    i.e.,

    $$\begin{aligned} \sqrt{N}\left( \left( {\hat{\beta }}_{L},{\hat{\beta }}_{F}\right) -(\beta _{L},\beta _{F})\right) \xrightarrow {d}N(0,\Sigma _{1}) \end{aligned}$$
  2. (2)

    The proof is similar to that of (1) and is omitted.

\(\square \)
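
The consistency and asymptotic normality results can be illustrated with a small Monte Carlo experiment. The Python sketch below deliberately uses a much simpler setting than the DMDFM: a cross-sectional linear model with one endogenous regressor and three instruments, which is an assumption made purely for illustration. The function names `gmm_iv` and `simulate`, the first-stage coefficients, and the weighting matrix \(A=(Z^{'}Z/N)^{-1}\) are likewise illustrative choices, not the design used in the paper.

```python
import numpy as np

def gmm_iv(y, x, Z, A):
    """Linear GMM estimate minimizing b(beta)' A b(beta), b = Z'(y - x*beta)/N."""
    N = len(y)
    R = Z.T @ x / N              # sample cross-moment of instruments and regressor
    g = Z.T @ y / N              # sample cross-moment of instruments and outcome
    return (R @ A @ g) / (R @ A @ R)

def simulate(N, rng, beta=1.0):
    """Draw one sample of size N and return the GMM estimate of beta."""
    pi = np.array([1.0, 0.5, 0.5])                    # assumed first-stage coefficients
    Z = rng.standard_normal((N, 3))                   # instruments
    eps = rng.standard_normal(N)                      # structural error, variance 1
    x = Z @ pi + 0.5 * eps + rng.standard_normal(N)   # endogenous regressor
    y = beta * x + eps
    A = np.linalg.inv(Z.T @ Z / N)                    # weighting matrix
    return gmm_iv(y, x, Z, A)

rng = np.random.default_rng(2)
beta = 1.0
results = {}
for N in (100, 400, 1600):
    draws = np.array([simulate(N, rng, beta) for _ in range(2000)])
    # (bias, standard deviation, standard deviation scaled by sqrt(N))
    results[N] = (draws.mean() - beta, draws.std(), np.sqrt(N) * draws.std())

for N, (bias, sd, scaled_sd) in results.items():
    print(f"N={N:5d}  bias={bias:+.4f}  sd={sd:.4f}  sqrt(N)*sd={scaled_sd:.4f}")
```

Under this assumed design the asymptotic variance of \(\sqrt{N}({\hat{\beta }}-\beta )\) is \(1/(\pi ^{'}\pi )=1/1.5\), so the scaled standard deviation should settle near \(\sqrt{1/1.5}\approx 0.816\) as N grows, while the unscaled standard deviation shrinks toward zero.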

Cite this article

Fang, G., Zhang, B. & Chen, K. Estimation of dynamic mixed double factors model in high-dimensional panel data. Soft Comput 24, 2527–2541 (2020). https://doi.org/10.1007/s00500-018-3603-1
