Abstract
We consider an online version of Principal Component Analysis (PCA), where the goal is to keep track of a low-dimensional subspace that captures most of the variance of data arriving sequentially in a stream. We assume the data stream is evolving, and hence the target subspace changes over time. We cast this as a prediction problem in which the goal is to minimize the total compression loss on the data sequence. We review the most popular methods for online PCA and show that the state-of-the-art IPCA algorithm is unable to track the best subspace in this setting. We then propose two modifications of this algorithm and show that they exhibit much better predictive performance than the original version of IPCA. Our algorithms are compared against other popular methods for online PCA in a computational experiment on real data sets from computer vision.
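The prediction setting described in the abstract can be illustrated with a minimal sketch. It assumes that the compression loss is the squared reconstruction error of each (uncentered) data point after projection onto the current subspace, and it uses an exponentially discounted covariance estimate as a generic tracking baseline. The names `compression_loss`, `online_pca_loss`, and `forgetting` are hypothetical, and this is not the IPCA algorithm or either of the modifications proposed in the paper.

```python
import numpy as np


def compression_loss(x, U):
    """Squared reconstruction error of x after projection onto span(U).

    U is assumed to have orthonormal columns (shape d x k).
    """
    residual = x - U @ (U.T @ x)
    return float(residual @ residual)


def online_pca_loss(stream, k, forgetting=0.95):
    """Run a predict-then-observe loop and return the total compression loss.

    Illustrative baseline only: the subspace is the span of the top-k
    eigenvectors of an exponentially discounted second-moment matrix.
    """
    d = stream.shape[1]
    cov = np.zeros((d, d))
    U = np.eye(d)[:, :k]  # arbitrary initial subspace (an assumption)
    total = 0.0
    for x in stream:
        # The loss is charged with the subspace chosen before x is seen.
        total += compression_loss(x, U)
        # Discount old data so that recent points dominate the estimate.
        cov = forgetting * cov + np.outer(x, x)
        # eigh returns eigenvalues in ascending order, so take the last k.
        _, vecs = np.linalg.eigh(cov)
        U = vecs[:, -k:]
    return total
```

Setting `forgetting=1.0` weights the whole history equally, which gives a subspace fitted to all past data; values below 1 are one simple way to let the estimate follow a drifting subspace.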
The authors acknowledge support from the Polish National Science Centre (grant no. 2016/22/E/ST6/00299).
Notes
1. Obtained from: http://www.cs.toronto.edu/~dross/ivt/.
2. Obtained from: http://www1.cs.columbia.edu/CAVE/software/softlib/coil-100.php.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Grabowska, M., Kotłowski, W. (2018). Online Principal Component Analysis for Evolving Data Streams. In: Czachórski, T., Gelenbe, E., Grochla, K., Lent, R. (eds) Computer and Information Sciences. ISCIS 2018. Communications in Computer and Information Science, vol 935. Springer, Cham. https://doi.org/10.1007/978-3-030-00840-6_15
DOI: https://doi.org/10.1007/978-3-030-00840-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00839-0
Online ISBN: 978-3-030-00840-6
eBook Packages: Computer Science, Computer Science (R0)