Abstract
Factor model analysis has emerged as a powerful tool for capturing the latent dynamic structure of functional data from a dimension-reduction viewpoint. Conventional methods for estimating the factor model are sensitive to heavy tails and outliers. To achieve robustness, we propose an eigenvalue-ratio-based method that estimates the number of factors by replacing the covariance operator with the functional pairwise spatial sign operator. Moreover, we propose a two-step robust approach to recover the factor space. The convergence rates of the robust estimators for factor loadings, factor scores, and common components are derived under mild conditions. Numerical studies and a real data analysis confirm that the proposed procedures remain reliable even when the factors and idiosyncratic errors have heavy-tailed distributions.
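As a rough illustration of the operator named in the abstract, the following Python sketch computes a sample pairwise spatial sign operator for curves observed on a common uniform grid. It averages \((Y_i-Y_j)\otimes (Y_i-Y_j)/\Vert Y_i-Y_j\Vert ^2\) over all pairs, in the spirit of Wang et al. (2021). The function names, the Riemann-sum approximation of the \(L^2\) inner product, and the grid spacing `dt` are assumptions made for this example and are not taken from the authors' implementation.

```python
import numpy as np


def pairwise_spatial_sign_operator(Y, dt):
    """Discretized kernel of the sample pairwise spatial sign operator.

    Y  : (N, T) array; row i holds curve Y_i on a uniform grid of T points.
    dt : grid spacing (assumed uniform), used for Riemann-sum L2 norms.
    Returns a symmetric (T, T) matrix K_hat.
    """
    N, T = Y.shape
    K_hat = np.zeros((T, T))
    n_pairs = 0
    for i in range(N):
        for j in range(i + 1, N):
            d = Y[i] - Y[j]
            norm_sq = np.sum(d * d) * dt  # approximate squared L2 norm
            if norm_sq > 0:  # skip identical curves, whose sign is undefined
                K_hat += np.outer(d, d) / norm_sq
                n_pairs += 1
    if n_pairs == 0:
        raise ValueError("need at least two distinct curves")
    return K_hat / n_pairs


def operator_eigenvalues(K_hat, dt):
    """Eigenvalues of the integral operator with kernel K_hat, descending."""
    vals = np.linalg.eigvalsh(K_hat * dt)  # eigvalsh returns ascending order
    return vals[::-1]
```

Because every pairwise term has unit trace after normalization, the eigenvalues returned by `operator_eigenvalues` sum to one up to rounding, whatever the tail behaviour of the curves. This mirrors the identity \(\sum _{i}\lambda _i(K)=1\) used in the proof of Lemma 3.1 in the Appendix.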
References
Alonso AM, Galeano P, Peña D (2020) A robust procedure to build dynamic factor models with cluster structure. J Econom 216(1):35–52
Aneiros G, Cao R, Vieu P (2019) Editorial on the special issue on functional data analysis and related topics
Aneiros G, Horová I, Hušková M, Vieu P (2022) On functional data analysis and related topics. J Multivar Anal 189:104861
Bali JL, Boente G (2017) Robust estimators under a functional common principal components model. Comput Stat Data Anal 113:424–440
Bali JL, Boente G, Tyler DE, Wang JL (2011) Robust functional principal components: a projection-pursuit approach. Ann Stat 39(6):2852–2882
Bardsley P, Horváth L, Kokoszka P, Young G (2017) Change point tests in functional factor models with application to yield curves. Econom J 20(1):86–117
Boente G, Salibián-Barrera M (2021) Robust functional principal components for sparse longitudinal data. Metron 79(2):159–188
Chen L, Wang W, Wu WB (2021) Dynamic semiparametric factor model with structural breaks. J Bus Econ Stat 39(3):757–771
Dai X, Müller HG (2018) Principal component analysis for functional data on Riemannian manifolds and spheres. Ann Stat 46(6B):3334–3361
Febrero-Bande M, Galeano P, González-Manteiga W (2017) Functional principal component regression and functional partial least-squares regression: an overview and a comparative study. Int Stat Rev 85(1):61–83
Gao Y, Shang HL, Yang Y (2019) High-dimensional functional time series forecasting: an application to age-specific mortality rates. J Multivar Anal 170:232–243
Gao Y, Shang HL, Yang Y (2021) Factor-augmented smoothing model for functional data. arXiv preprint arXiv:2102.02580
Gervini D (2009) Detecting and handling outlying trajectories in irregularly sampled functional datasets. Ann Appl Stat 3(4):1758–1775
Guo S, Qiao X, Wang Q (2021) Factor modelling for high-dimensional functional time series. arXiv preprint arXiv:2112.13651
Hall P, Hosseini-Nasab M (2006) On properties of functional principal components analysis. J R Stat Soc Ser B Stat Methodol 68(1):109–126
Hall P, Müller HG, Wang JL (2006) Properties of principal component methods for functional and longitudinal data analysis. Ann Stat 34(3):1493–1517
Hallin M, Nisol G, Tavakoli S (2023) Factor models for high-dimensional functional time series I: representation results. J Time Ser Anal 44(5–6):578–600
Han F, Liu H (2018) ECA: high-dimensional elliptical component analysis in non-Gaussian distributions. J Am Stat Assoc 113(521):252–268
Hays S, Shen H, Huang JZ (2012) Functional dynamic factor models with application to yield curve forecasting. Ann Appl Stat 6(3):870–894
He Y, Kong X, Yu L, Zhang X (2022) Large-dimensional factor analysis without moment constraints. J Bus Econ Stat 40(1):302–312
He Y, Li L, Liu D, Zhou WX (2023) Huber principal component analysis for large-dimensional factor models. arXiv preprint arXiv:2303.02817
Horváth L, Kokoszka P (2012) Inference for functional data with applications. Springer, Berlin
Horváth L, Li B, Li H, Liu Z (2020) Time-varying beta in functional factor models: evidence from China. N Am J Econ Finance 54:101283
Kokoszka P, Miao H, Zhang X (2015) Functional dynamic factor model for intraday price curves. J Financ Econom 13(2):456–477
Kowal DR, Canale A (2021) Semiparametric functional factor models with Bayesian rank selection. arXiv preprint arXiv:2108.02151
Li G, Huang JZ, Shen H (2018) Exponential family functional data analysis via a low-rank model. Biometrics 74(4):1301–1310
Li D, Qiao X, Wang Z (2023) Factor-guided estimation of large covariance matrix function with conditional functional sparsity. arXiv preprint arXiv:2311.02450
Ling N, Vieu P (2018) Nonparametric modelling for functional data: selected survey and tracks for future. Statistics 52(4):934–949
Lu J, Han F, Liu H (2021) Robust scatter matrix estimation for high dimensional distributions with heavy tail. IEEE Trans Inf Theory 67(8):5283–5304
Otto S, Salish N (2022) Approximate factor models for functional time series. arXiv preprint arXiv:2201.02532
Park Y, Oh HS, Lim Y (2024) A data-adaptive dimension reduction for functional data via penalized low-rank approximation. Stat Comput 34(36):66
Ran H, Bai Y (2021) On soft Bayesian additive regression trees and asynchronous longitudinal regression analysis. arXiv preprint arXiv:2108.11603
Sawant P, Billor N, Shin H (2012) Functional outlier detection with robust functional principal component analysis. Comput Stat 27:83–102
Stock JH, Watson MW (2012) Dynamic factor models. Oxford University Press, Oxford
Stock JH, Watson MW (2016) Dynamic factor models, factor-augmented vector autoregressions, and structural vector autoregressions in macroeconomics. In: Handbook of macroeconomics, vol 2. Elsevier, pp 415–525
Tang C, Shang HL, Yang Y (2021) Multi-population mortality forecasting using high-dimensional functional factor models. arXiv preprint arXiv:2109.04146
Tang C, Shang HL, Yang Y (2022) Clustering and forecasting multiple functional time series. Ann Appl Stat 16(4):2523–2553
Tavakoli S, Nisol G, Hallin M (2019) High-dimensional functional factor models. arXiv preprint arXiv:1905.10325
Tavakoli S, Nisol G, Hallin M (2023) Factor models for high-dimensional functional time series II: estimation and forecasting. J Time Ser Anal 44(5–6):600–621
Wang D, Liu X, Chen R (2019) Factor models for matrix-valued high-dimensional time series. J Econom 208(1):231–248
Wang G, Liu S, Han F, Di C (2021) Robust functional principal component analysis via functional pairwise spatial signs. arXiv preprint arXiv:2101.06415
Wen S, Lin H (2022) Factor-guided functional PCA for high-dimensional functional data. arXiv preprint arXiv:2211.12012
Wohl DA, Zeng D, Stewart P, Glomb N, Alcorn T, Jones S, Handy J, Fiscus S, Weinberg A, Gowda D et al (2005) Cytomegalovirus viremia, mortality, and end-organ disease among patients with AIDS receiving potent antiretroviral therapies. J Acquir Immune Defic Syndr 38(5):538–544
Yang X, Du L (2023) Robust multiple testing under high-dimensional dynamic factor model. arXiv preprint arXiv:2303.07631
Yao F, Müller HG, Wang JL (2005) Functional data analysis for sparse longitudinal data. J Am Stat Assoc 100(470):577–590
Yu L, He Y, Zhang X (2019) Robust factor number specification for large-dimensional elliptical factor model. J Multivar Anal 174:104543
Zhong R, Liu S, Li H, Zhang J (2022a) Functional principal component analysis estimator for non-Gaussian data. J Stat Comput Simul 92(13):2788–2801
Zhong R, Liu S, Li H, Zhang J (2022b) Robust functional principal component analysis for non-Gaussian longitudinal data. J Multivar Anal 189:104864
Acknowledgements
The authors thank the Editor-in-Chief, an Associate Editor, and two anonymous reviewers for many helpful and constructive comments. This research was sponsored by the National Natural Science Foundation of China (Grant No. 72071068) and China Scholarship Council (202206690042).
Appendix
Proof of Lemma 3.1. Based on Assumptions 3.1–3.3, we obtain \(N(C_2+o(1))\le \lambda _i(\Gamma )\le N(C_1+o(1))\) for \(i\le k\); \(C_2\le \lambda _i(\Gamma )\le C_1\) for \(k<i\le N\); and \(\lambda _i(\Gamma )\le C_1\) for \(i>N\). Moreover, by Theorem 2.4 in Zhong et al. (2022b), \( \lambda _{i}(K)=\mathbb {E}\left[ \frac{\lambda _{i}(\Gamma ) U_{i}^{2}}{\sum _{j=1}^{\infty } \lambda _{j}(\Gamma ) U_{j}^{2}}\right] \) with \( U_{i}=\frac{\langle Y-{\tilde{Y}}, \phi _{i}\rangle }{\sqrt{2 \lambda _{i}(\Gamma )} }\) and \( \sum _{i=1}^{\infty } \lambda _{i}(K)=1 \). Thus, for \(i\le k\),
where \(a_1=\frac{\lambda _1(\Gamma )}{C_1}-1\), and we have
Similarly,
which implies that \(\lambda _i(K)\ge \frac{C_2}{2kC_1}+o(1)\).
For \(k<i\le N\), we have
For \(i> N\), we have
Proof of Theorem 3.1
By Lemma 2.1, Lemma 3.1, and Weyl’s theorem, \( \lambda _{r}({\widehat{K}}) \asymp 1, r \le k \) and \(\lambda _{r}({\widehat{K}})=O_{p}(N^{-1 / 2}), r>k \). Let \(\alpha =N^{-1 / 2}\), then
Therefore,
which concludes the consistency.
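The separation used in this proof, with eigenvalues of \({\widehat{K}}\) of order one for \(r\le k\) and of order \(N^{-1/2}\) for \(r>k\), is what makes a ratio criterion work: the ratio of consecutive eigenvalues is large at \(r=k\). The Python sketch below illustrates the idea with a plain eigenvalue ratio. The function name, the cap `k_max`, and the use of the unmodified ratio \(\lambda _r/\lambda _{r+1}\) are assumptions for illustration only; the estimator analysed in the theorem, including how \(\alpha =N^{-1/2}\) enters, is defined in the main text and may differ.

```python
import numpy as np


def eigenvalue_ratio_estimate(eigvals, k_max):
    """Pick r in 1..k_max maximizing lambda_r / lambda_{r+1}.

    eigvals : eigenvalues of an estimated operator, in any order.
    k_max   : assumed upper bound on the number of factors.
    Returns (k_hat, ratios), where ratios[r-1] = lambda_r / lambda_{r+1}.
    """
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]  # descending
    if not 1 <= k_max < lam.size:
        raise ValueError("k_max must satisfy 1 <= k_max < len(eigvals)")
    if np.any(lam[1:k_max + 1] <= 0):
        raise ValueError("eigenvalues used as denominators must be positive")
    ratios = lam[:k_max] / lam[1:k_max + 1]
    return int(np.argmax(ratios)) + 1, ratios


# Hypothetical spectrum: three eigenvalues of order one, the rest small.
lam_demo = [0.40, 0.30, 0.20, 0.02, 0.015, 0.012, 0.01]
k_hat_demo, ratios_demo = eigenvalue_ratio_estimate(lam_demo, k_max=5)
```

With the hypothetical spectrum above, the largest ratio occurs at the third eigenvalue, so `k_hat_demo` is 3. In practice `k_max` has to be chosen by the user, and ratios among the small trailing eigenvalues can be unstable, which is one reason a threshold such as \(\alpha \) is useful.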
Proof of Theorem 3.2. Similarly to the proof of Lemma 2.3 in Horváth and Kokoszka (2012), we have
where \( \alpha =\min \left\{ \lambda _{1}-\lambda _{2}, \ldots , \lambda _{k-1}-\lambda _{k}, \lambda _{k}\right\} \), and the Hilbert-Schmidt norm of a Hilbert-Schmidt operator \(S\) is defined by \(\Vert S\Vert _{HS}^{2}=\sum _{j=1}^{\infty }\Vert S e_{j}\Vert ^{2}\),
where \( \left\{ e_{1}, e_{2}, \ldots \right\} \) is any orthonormal basis. Then the asymptotic result of the factors follows from Lemma 2.1. In addition, by Assumption 3.3, \(\varvec{S}=\textrm{sgn}(\frac{1}{N} \sum _{i=1}^{N}(\widehat{\varvec{l}}_i \varvec{l}_i^{\top })) =\textrm{diag}\{s_1,\dots ,s_k\}\) with entries \(\pm 1\). Then,
Recall that \({\widehat{l}}_{ih}=\langle Y_{i}, {\widehat{f}}_{h}\rangle \). Then \(\frac{1}{N}\sum _{i=1}^{N}\sum _{h=1}^{k}({\widehat{l}}_{ih}-s_h l_{ih})^2=\frac{1}{N}\sum _{i=1}^{N}\sum _{h=1}^{k} \left( \int _\mathcal {I}\left( Y_i(t)({\widehat{f}}_h(t)-s_h f_h(t))+\epsilon _i(t)s_h f_h(t)+s_h l_{ih}(f_h(t)f_h(t)-1)\right) \textrm{d}t\right) ^2=O_p(N^{-1})\), where the last equality follows from Assumption 3.1 and the orthonormality of the factor curves. This concludes the proof of the theorem.
Proof of Corollary 3.1. By Theorem 3.2 and the triangle inequality, we have
Cite this article
Yang, S., Ling, N. Robust estimation of functional factor models with functional pairwise spatial signs. Comput Stat (2024). https://doi.org/10.1007/s00180-024-01477-2