# Robust Principal Component Analysis

Part of the Interdisciplinary Applied Mathematics book series (IAM, volume 40)

## Abstract

In the previous chapter, we considered the PCA problem under the assumption that all the sample points are drawn from the same statistical or geometric model: a low-dimensional subspace.

### Keywords

• Missing Entries
• Matrix Completion
• Principal Component Pursuit (PCP)
• Alternating Direction Method Of Multipliers (ADMM)
• Robust PCA (RPCA)

• DOI: 10.1007/978-0-387-87811-9_3
• Chapter length: 60 pages
• ISBN: 978-0-387-87811-9

## Notes

1.

If $$U \in \mathbb{R}^{D\times d}$$ and $$V \in \mathbb{R}^{N\times d}$$, then U and V have dD + dN degrees of freedom in general. However, to specify the subspace, it suffices to specify $$UV^{\top}$$, which is equal to $$UAA^{-1}V^{\top}$$ for every invertible matrix $$A \in \mathbb{R}^{d\times d}$$; hence the matrix $$UV^{\top}$$ has only $$dD + dN - d^{2}$$ degrees of freedom.
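This parameter count is easy to check numerically. The following sketch (with arbitrary illustrative dimensions) verifies that the product $$UV^{\top}$$ is unchanged when the factors are replaced by $$(UA, VA^{-\top})$$:

```python
import numpy as np

rng = np.random.default_rng(0)
D, N, d = 8, 10, 3

U = rng.standard_normal((D, d))  # dD free parameters
V = rng.standard_normal((N, d))  # dN free parameters
A = rng.standard_normal((d, d))  # a generic (hence invertible) d x d matrix

# (U, V) and (U A, V A^{-T}) yield the same product U V^T, so d^2 of the
# dD + dN parameters are redundant: U V^T has dD + dN - d^2 degrees of freedom.
U2 = U @ A
V2 = V @ np.linalg.inv(A).T
assert np.allclose(U @ V.T, U2 @ V2.T)
```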

2.

Such conditions typically require that the linear measurements and the matrix X be in some sense incoherent.

3.

X can be factorized as $$X = UAA^{-1}V ^{\top }$$, where $$U,V \in \mathbb{R}^{N\times d}$$ have Nd entries each, and $$A \in \mathbb{R}^{d\times d}$$ is an invertible matrix.

4.

Previously, we have used M to denote the number of observed entries in a specific matrix X. Notice that here, M is the expected number of observed entries under a random model in which the locations are sampled independently and uniformly at random. Thus, if p is the probability that an entry is observed, then the expected number of observed entries is pDN. Therefore, one can state the result either in terms of p or in terms of the expected number of observed entries, as we have done. For ease of exposition, we will continue to refer to M as the number of observed entries in the main text, but the reader is reminded that all the theoretical results refer to the expected number of observed entries, because the model for the observed entries is random.
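The random observation model described in this note is straightforward to simulate. A minimal sketch (dimensions are illustrative), in which each entry is revealed independently with probability p:

```python
import numpy as np

rng = np.random.default_rng(1)
D, N, p = 200, 300, 0.2

# Each entry of the D x N matrix is observed independently with probability p.
mask = rng.random((D, N)) < p
M_observed = int(mask.sum())

# The expected number of observed entries is E[M] = p * D * N.
M_expected = p * D * N
print(M_observed, M_expected)  # the observed count concentrates around p*D*N
```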

5.

A polylog factor is a polynomial in the logarithm; i.e., $$O(\mathrm{polylog}(N))$$ means $$O((\log N)^{k})$$ for some integer k.

6.

Further performance gains might be possible by replacing this partial SVD with an approximate SVD, as suggested in (Goldfarb and Ma 2009) for nuclear norm minimization.
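The SVD in question appears inside a singular value thresholding step, i.e., the proximal operator of the nuclear norm. A minimal sketch (a full SVD is used here for clarity, where an algorithm would use a partial or approximate one):

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: prox of tau times the nuclear norm.

    A full SVD is used for clarity; since only the singular values
    exceeding tau survive the shrinkage, a partial (or approximate) SVD
    of the leading part of the spectrum suffices in practice.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(2)
X = rng.standard_normal((20, 5)) @ rng.standard_normal((5, 30))  # rank <= 5
Y = svt(X, tau=1.0)  # each singular value of Y is max(sigma_i - 1, 0)
```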

7.

For a more thorough exposition of outliers in statistics, we recommend the books by Barnett and Lewis (1983) and Huber (1981).

8.

In fact, it can be shown (Ferguson 1961) that if the outliers have a Gaussian distribution with a different covariance matrix $$a\Sigma$$, then $$\varepsilon _{i}$$ is a sufficient statistic for the test that maximizes the probability of a correct decision about the outlier (within the class of tests that are invariant under linear transformations). Interested readers may want to find out how this distance is equivalent (or related) to the sample influence $$\hat{\Sigma }_{N}^{(i)} -\hat{ \Sigma }_{N}$$ or the approximate sample influence given in (B.91).
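As an illustration on hypothetical data (taking the statistic to be the squared Mahalanobis distance to the sample mean), a point drawn with an inflated covariance indeed attains the largest value of the statistic:

```python
import numpy as np

rng = np.random.default_rng(3)
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
L = np.linalg.cholesky(Sigma)

# 100 inliers from N(0, Sigma); plant one gross outlier at index 0.
X = rng.standard_normal((100, 2)) @ L.T
X[0] = np.array([10.0, 10.0])

mu = X.mean(axis=0)
S_inv = np.linalg.inv(np.cov(X, rowvar=False))

# Squared Mahalanobis distance of each sample to the sample mean.
d2 = np.einsum('ij,jk,ik->i', X - mu, S_inv, X - mu)
print(d2.argmax())  # the planted outlier attains the largest distance
```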

## References

• Amaldi, E., & Kann, V. (1998). On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theoretical Computer Science, 209, 237–260.

• Bach, F. (2013). Convex relaxations of structured matrix factorizations. arXiv:1309.3117v1.

• Barnett, V., & Lewis, T. (1983). Outliers in statistical data (2nd ed.). New York: Wiley.

• Basri, R., & Jacobs, D. (2003). Lambertian reflection and linear subspaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(2), 218–233.

• Bertsekas, D. P. (1999). Nonlinear programming (2nd ed.). Optimization and computation (Vol. 2). Belmont: Athena Scientific.

• Brandt, S. (2002). Closed-form solutions for affine reconstruction under missing data. In Proceedings of Statistical Methods for Video Processing (ECCV'02 Workshop).

• Buchanan, A., & Fitzgibbon, A. (2005). Damped Newton algorithms for matrix factorization with missing data. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 316–322).

• Burer, S., & Monteiro, R. D. C. (2005). Local minima and convergence in low-rank semidefinite programming. Mathematical Programming, Series A, 103(3), 427–444.

• Cai, J.-F., Candès, E. J., & Shen, Z. (2008). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 1956–1982.

• Candès, E. (2006). Compressive sampling. In Proceedings of the International Congress of Mathematicians.

• Candès, E. (2008). The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9–10), 589–592.

• Candès, E., Li, X., Ma, Y., & Wright, J. (2011). Robust principal component analysis? Journal of the ACM, 58(3).

• Candès, E., & Plan, Y. (2010). Matrix completion with noise. Proceedings of the IEEE, 98(6), 925–936.

• Candès, E., & Recht, B. (2009). Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9, 717–772.

• Candès, E., & Tao, T. (2005). Decoding by linear programming. IEEE Transactions on Information Theory, 51(12), 4203–4215.

• Candès, E., & Tao, T. (2010). The power of convex relaxation: Near-optimal matrix completion. IEEE Transactions on Information Theory, 56(5), 2053–2080.

• Chandrasekaran, V., Sanghavi, S., Parrilo, P., & Willsky, A. (2009). Sparse and low-rank matrix decompositions. In IFAC Symposium on System Identification.

• De la Torre, F., & Black, M. J. (2004). A framework for robust subspace learning. International Journal of Computer Vision, 54(1), 117–142.

• Fei-Fei, L., Fergus, R., & Perona, P. (2004). Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. In Workshop on Generative Model Based Vision.

• Ferguson, T. (1961). On the rejection of outliers. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability.

• Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395.

• Ganesh, A., Wright, J., Li, X., Candès, E., & Ma, Y. (2010). Dense error correction for low-rank matrices via principal component pursuit. In International Symposium on Information Theory.

• Geman, S., & McClure, D. (1987). Statistical methods for tomographic image reconstruction. In Proceedings of the 46th Session of the ISI, Bulletin of the ISI (Vol. 52, pp. 5–21).

• Goldfarb, D., & Ma, S. (2009). Convergence of fixed point continuation algorithms for matrix rank minimization. Preprint.

• Golub, G. H., & Van Loan, C. F. (1996). Matrix computations (3rd ed.). Baltimore: Johns Hopkins University Press.

• Gross, D. (2011). Recovering low-rank matrices from few coefficients in any basis. IEEE Transactions on Information Theory, 57(3), 1548–1566.

• Gruber, A., & Weiss, Y. (2004). Multibody factorization with uncertainty and missing data using the EM algorithm. In IEEE Conference on Computer Vision and Pattern Recognition (Vol. I, pp. 707–714).

• Aanæs, H., Fisker, R., Åström, K., & Carstensen, J. M. (2002). Robust factorization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9), 1215–1225.

• Haeffele, B., & Vidal, R. (2015). Global optimality in tensor factorization, deep learning, and beyond. Preprint, http://arxiv.org/abs/1506.07540.

• Haeffele, B., Young, E., & Vidal, R. (2014). Structured low-rank matrix factorization: Optimality, algorithm, and applications to image processing. In International Conference on Machine Learning.

• Hardt, M. (2014). Understanding alternating minimization for matrix completion. In Symposium on Foundations of Computer Science.

• Hartley, R., & Schaffalitzky, F. (2003). Powerfactorization: An approach to affine reconstruction with missing and uncertain data. In Proceedings of Australia-Japan Advanced Workshop on Computer Vision.

• Huber, P. (1981). Robust Statistics. New York: Wiley.

• Jacobs, D. (2001). Linear fitting with missing data: Applications to structure-from-motion. Computer Vision and Image Understanding, 82, 57–81.

• Jain, P., Meka, R., & Dhillon, I. (2010). Guaranteed rank minimization via singular value projection. In Neural Information Processing Systems (pp. 937–945).

• Jain, P., & Netrapalli, P. (2014). Fast exact matrix completion with finite samples. arXiv:1411.1087.

• Jain, P., Netrapalli, P., & Sanghavi, S. (2012). Low-rank matrix completion using alternating minimization. arXiv:1212.0467.

• Johnson, C. (1990). Matrix completion problems: A survey. In Proceedings of Symposia in Applied Mathematics.

• Jolliffe, I. (2002). Principal Component Analysis (2nd ed.). New York: Springer.

• Ke, Q., & Kanade, T. (2005). Robust 1-norm factorization in the presence of outliers and missing data. In IEEE Conference on Computer Vision and Pattern Recognition.

• Keshavan, R., Montanari, A., & Oh, S. (2010a). Matrix completion from a few entries. IEEE Transactions on Information Theory, 56(6), 2980–2998.

• Keshavan, R., Montanari, A., & Oh, S. (2010b). Matrix completion from noisy entries. Journal of Machine Learning Research, 11, 2057–2078.

• Keshavan, R. H. (2012). Efficient algorithms for collaborative filtering. Ph.D. Thesis. Stanford University.

• Kontogiorgis, S., & Meyer, R. (1998). A variable-penalty alternating direction method for convex optimization. Mathematical Programming, 83, 29–53.

• Lanczos, C. (1950). An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. Journal of Research of the National Bureau of Standards, 45, 255–282.

• Lin, Z., Chen, M., Wu, L., & Ma, Y. (2011). The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv:1009.5055v2.

• Lions, P., & Mercier, B. (1979). Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis, 16(6), 964–979.

• Recht, B., Fazel, M., & Parrilo, P. (2010). Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Review, 52(3), 471–501.

• Shum, H.-Y., Ikeuchi, K., & Reddy, R. (1995). Principal component analysis with missing data and its application to polyhedral object modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(9), 854–867.

• Soltanolkotabi, M., & Candès, E. J. (2013). A geometric analysis of subspace clustering with outliers. Annals of Statistics, 40(4), 2195–2238.

• Stewart, C. V. (1999). Robust parameter estimation in computer vision. SIAM Review, 41(3), 513–537.

• Udell, M., Horn, C., Zadeh, R., & Boyd, S. (2015). Generalized low rank models. Working manuscript.

• Wiberg, T. (1976). Computation of principal components when data are missing. In Symposium on Computational Statistics (pp. 229–326).

• Wright, J., Ganesh, A., Min, K., & Ma, Y. (2013). Compressive principal component analysis. IMA Journal on Information and Inference, 2(1), 32–68.

• Wright, J., Ganesh, A., Rao, S., Peng, Y., & Ma, Y. (2009a). Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. In Neural Information Processing Systems.

• Xu, H., Caramanis, C., & Sanghavi, S. (2010). Robust PCA via outlier pursuit. In Neural Information Processing Systems (NIPS).

• Yuan, X., & Yang, J. (2009). Sparse and low-rank matrix decomposition via alternating direction methods. Preprint.

• Zhou, M., Wang, C., Chen, M., Paisley, J., Dunson, D., & Carin, L. (2010a). Nonparametric Bayesian matrix completion. In Sensor Array and Multichannel Signal Processing Workshop.

• Zhou, Z., Wright, J., Li, X., Candès, E., & Ma, Y. (2010b). Stable principal component pursuit. In International Symposium on Information Theory.
