Nearly optimal stochastic approximation for online principal subspace estimation

Liang, Xin; Guo, Zhen-Chen; Wang, Li; Li, Ren-Cang; Lin, Wen-Wei

doi:10.1007/s11425-021-1972-5

Published: 30 August 2022

Volume 66, pages 1087–1122, (2023)

Science China Mathematics

Xin Liang^1,2,
Zhen-Chen Guo^3,
Li Wang^4,
Ren-Cang Li^4,5 &
…
Wen-Wei Lin^6,7

Abstract

Principal component analysis (PCA) has been widely used in analyzing high-dimensional data. It converts a set of observed data points of possibly correlated variables into a set of linearly uncorrelated variables via an orthogonal transformation. To handle streaming data and reduce the complexity of PCA, (subspace) online PCA iterations were proposed to iteratively update the orthogonal transformation by taking one observed data point at a time. Existing works on the convergence of (subspace) online PCA iterations mostly focus on the case where the samples are almost surely uniformly bounded. In this paper, we analyze the convergence of a subspace online PCA iteration under a more practical assumption and obtain a nearly optimal finite-sample error bound. Our convergence rate almost matches the minimax information lower bound. We prove that the convergence is nearly global in the sense that the subspace online PCA iteration is convergent with high probability for random initial guesses. This work also leads to a simpler proof of the recent work on analyzing online PCA for the first principal component only.
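As a rough illustration of the kind of iteration the abstract describes, the following is a minimal sketch of an Oja-style subspace update that consumes one sample at a time and re-orthonormalizes by QR. It is not taken from the paper: the function names, the constant step size, the QR normalization, and the synthetic Gaussian data (used here only as an example of samples that are not uniformly bounded) are all assumptions made for illustration.

```python
import numpy as np

def subspace_online_pca(samples, p, step, rng):
    """Oja-style subspace iteration (illustrative sketch, not the paper's exact scheme)."""
    d = samples.shape[1]
    # Random initial guess: orthonormal basis of a random p-dimensional subspace.
    U, _ = np.linalg.qr(rng.standard_normal((d, p)))
    for x in samples:
        # Stochastic update U + step * x x^T U, using only the current sample.
        U = U + step * np.outer(x, x @ U)
        # Re-orthonormalize so the columns remain an orthonormal basis.
        U, _ = np.linalg.qr(U)
    return U

def max_sin_angle(U, V):
    """Sine of the largest canonical angle between span(U) and span(V), both orthonormal."""
    cosines = np.linalg.svd(V.T @ U, compute_uv=False)
    return float(np.sqrt(max(0.0, 1.0 - cosines.min() ** 2)))

# Synthetic test problem (all values below are assumed, chosen for illustration).
rng = np.random.default_rng(0)
d, p, n = 10, 2, 5000
eigenvalues = np.array([5.0, 4.0] + [0.5] * (d - p))
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
# Zero-mean Gaussian samples with covariance Q diag(eigenvalues) Q^T.
samples = (rng.standard_normal((n, d)) * np.sqrt(eigenvalues)) @ Q.T
V_true = Q[:, :p]  # principal subspace spanned by the top-p eigenvectors

U_hat = subspace_online_pca(samples, p, step=0.005, rng=rng)
err = max_sin_angle(U_hat, V_true)
print(f"sine of largest canonical angle after {n} samples: {err:.3f}")
```

The error is measured by the largest canonical angle between the estimated and true subspaces, a common choice for subspace estimation. With a constant step size the iterate only settles into a neighborhood of the principal subspace, so the printed error is small but not zero.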

Acknowledgements

This work was supported by National Natural Science Foundation of China (Grant No. 11901340), National Science Foundation of USA (Grant Nos. DMS-1719620 and DMS-2009689), Ministry of Science and Technology of Taiwan, Taiwanese Center for Theoretical Sciences, and the ST Yau Centre at the Taiwan Chiao Tung University. The authors are indebted to the referees for their constructive comments and suggestions that improved the presentation.

Author information

Authors and Affiliations

Yau Mathematical Sciences Center, Tsinghua University, Beijing, 100084, China
Xin Liang
Yanqi Lake Beijing Institute of Mathematical Sciences and Applications, Beijing, 101408, China
Xin Liang
Department of Mathematics, Nanjing University, Nanjing, 210093, China
Zhen-Chen Guo
Department of Mathematics, University of Texas at Arlington, Arlington, TX, 76019, USA
Li Wang & Ren-Cang Li
Department of Mathematics, Hong Kong Baptist University, Hong Kong, China
Ren-Cang Li
Nanjing Center for Applied Mathematics, Nanjing, 211135, China
Wen-Wei Lin
Department of Applied Mathematics, Yang Ming Chiao Tung University, Hsinchu, 300, China
Wen-Wei Lin

Corresponding author

Correspondence to Ren-Cang Li.

About this article

Cite this article

Liang, X., Guo, ZC., Wang, L. et al. Nearly optimal stochastic approximation for online principal subspace estimation. Sci. China Math. 66, 1087–1122 (2023). https://doi.org/10.1007/s11425-021-1972-5

Received: 28 February 2021
Accepted: 23 May 2022
Published: 30 August 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s11425-021-1972-5
