
Robust Principal Component Analysis

Part of the Interdisciplinary Applied Mathematics book series (IAM, volume 40)

Abstract

In the previous chapter, we considered the PCA problem under the assumption that all the sample points are drawn from the same statistical or geometric model: a low-dimensional subspace.

Keywords

  • Missing Entries
  • Matrix Completion
  • Principal Component Pursuit (PCP)
  • Alternating Direction Method of Multipliers (ADMM)
  • Robust PCA (RPCA)

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
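Two of these keywords, Principal Component Pursuit and ADMM, fit together naturally: PCP decomposes an observed matrix M into a low-rank part L and a sparse part S by solving min ||L||_* + λ||S||_1 subject to L + S = M. The following is a minimal sketch of our own (not the book's implementation) of PCP solved by an augmented-Lagrangian/ADMM-style iteration, using the common default parameters λ = 1/√max(D, N) and μ = DN/(4‖M‖₁):

```python
import numpy as np

def pcp_admm(M, lam=None, mu=None, n_iter=300):
    """Principal Component Pursuit: min ||L||_* + lam*||S||_1  s.t.  L + S = M."""
    D, N = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(D, N))
    mu = mu if mu is not None else D * N / (4.0 * np.abs(M).sum())
    soft = lambda X, t: np.sign(X) * np.maximum(np.abs(X) - t, 0.0)
    L, S, Y = (np.zeros_like(M) for _ in range(3))
    for _ in range(n_iter):
        # L-step: singular value thresholding of M - S + Y/mu
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft(s, 1.0 / mu)) @ Vt
        # S-step: entrywise soft thresholding
        S = soft(M - L + Y / mu, lam / mu)
        # Dual ascent on the constraint L + S = M
        Y = Y + mu * (M - L - S)
    return L, S

# Synthetic check: rank-4 matrix plus ~5% sparse gross corruptions.
rng = np.random.default_rng(0)
L0 = rng.standard_normal((60, 4)) @ rng.standard_normal((4, 60))
S0 = np.zeros((60, 60))
mask = rng.random((60, 60)) < 0.05
S0[mask] = 10.0 * rng.choice([-1.0, 1.0], size=mask.sum())
L, S = pcp_admm(L0 + S0)
rel_err = np.linalg.norm(L - L0) / np.linalg.norm(L0)
print(np.linalg.matrix_rank(L), round(rel_err, 4))
```

On this small random instance the recovered L has rank 4 and a small relative error; the L-step and S-step are exactly the singular value thresholding and soft thresholding operators discussed in the chapter.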

DOI: 10.1007/978-0-387-87811-9_3
Chapter length: 60 pages
eBook ISBN: 978-0-387-87811-9
[Figures 3.1–3.7 omitted from this preview.]

Notes

  1.

    If \(U \in \mathbb{R}^{D\times d}\) and \(V \in \mathbb{R}^{N\times d}\), then U and V have dD + dN degrees of freedom in general. However, to specify the subspace, it suffices to specify \(UV ^{\top }\), which is equal to \(UAA^{-1}V ^{\top }\) for every invertible matrix \(A \in \mathbb{R}^{d\times d}\); hence the matrix \(UV ^{\top }\) has \(dD + dN - d^{2}\) degrees of freedom.
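This non-uniqueness of the factors, and the resulting degree-of-freedom count, is easy to verify numerically (a minimal numpy sketch with arbitrarily chosen dimensions):

```python
import numpy as np

rng = np.random.default_rng(0)
D, N, d = 8, 10, 3
U = rng.standard_normal((D, d))
V = rng.standard_normal((N, d))
A = rng.standard_normal((d, d))   # generically invertible

# (UA) and (V A^{-T}) are different factors with the same product U V^T.
U2, V2 = U @ A, V @ np.linalg.inv(A).T
print(np.allclose(U @ V.T, U2 @ V2.T))   # True
print(d * D + d * N - d**2)              # 45 degrees of freedom
```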

  2.

    Such conditions typically require that the linear measurements and the matrix X be in some sense incoherent.

  3.

    X can be factorized as \(X = UAA^{-1}V ^{\top }\), where \(U,V \in \mathbb{R}^{N\times d}\) have Nd entries each, and \(A \in \mathbb{R}^{d\times d}\) is an invertible matrix.

  4.

    Previously, we have used M to denote the number of observed entries in a specific matrix X. Notice that here, M is the expected number of observed entries under a random model in which the locations are sampled independently and uniformly at random. Thus, if p is the probability that an entry is observed, then the expected number of observed entries is pDN. Therefore, one can state the result either in terms of p or in terms of the expected number of observed entries, as we have done. For ease of exposition, we will continue to refer to M as the number of observed entries in the main text, but the reader is reminded that all the theoretical results refer to the expected number of observed entries, because the model for the observed entries is random.
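The distinction between the realized and the expected number of observed entries can be checked in a few lines (a sketch under the random sampling model of this note; the dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
D, N, p = 200, 300, 0.3
mask = rng.random((D, N)) < p   # each entry observed independently with prob. p

print(p * D * N)                # expected number of observed entries: 18000.0
# The realized count fluctuates around pDN but concentrates tightly for large DN.
print(abs(mask.sum() - p * D * N) / (p * D * N) < 0.05)
```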

  5.

    A polylog factor is a polynomial in the logarithm; i.e., \(O(\mathrm{polylog}(N))\) means \(O((\log N)^{k})\) for some integer k.

  6.

    Further performance gains might be possible by replacing this partial SVD with an approximate SVD, as suggested in (Goldfarb and Ma 2009) for nuclear norm minimization.
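One simple instance of such an approximate SVD is a randomized range-finder sketch (our own illustration, not the algorithm of Goldfarb and Ma): instead of computing singular triplets of X directly, one computes them on a small random projection of X.

```python
import numpy as np

def approx_svd(X, k, p=5, seed=0):
    """Approximate rank-k SVD via a randomized range finder (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Capture (approximately) the column space of X with k + p random probes.
    Q, _ = np.linalg.qr(X @ rng.standard_normal((X.shape[1], k + p)))
    # SVD of the much smaller (k+p) x N matrix Q^T X.
    Ub, s, Vt = np.linalg.svd(Q.T @ X, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]

rng = np.random.default_rng(2)
X = rng.standard_normal((500, 8)) @ rng.standard_normal((8, 400))   # rank 8
U, s, Vt = approx_svd(X, k=5)
full = np.linalg.svd(X, compute_uv=False)
print(np.allclose(s, full[:5]))   # exact here, since the sketch covers range(X)
```

Because X has rank 8 and the sketch has 10 columns, the random probes span the column space of X almost surely, so the top singular values are recovered exactly; for full-rank matrices the method is approximate, which is precisely the trade-off the note alludes to.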

  7.

    For a more thorough exposition of outliers in statistics, we recommend the books by Barnett and Lewis (1983) and Huber (1981).

  8.

    In fact, it can be shown (Ferguson 1961) that if the outliers have a Gaussian distribution with a different covariance matrix \(a\Sigma \), then \(\varepsilon _{i}\) is a sufficient statistic for the test that maximizes the probability of a correct decision about the outlier (in the class of tests that are invariant under linear transformations). Interested readers may want to find out how this distance is equivalent (or related) to the sample influence \(\hat{\Sigma }_{N}^{(i)} -\hat{ \Sigma }_{N}\) or the approximate sample influence given in (B.91).
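As a rough numerical illustration of this kind of test statistic (our own sketch, not Ferguson's construction), take \(\varepsilon_i\) to be the squared Mahalanobis distance under the estimated covariance, and plant a few points whose covariance is \(a\Sigma\) with a = 25:

```python
import numpy as np

rng = np.random.default_rng(3)
N, D = 500, 2
X = rng.standard_normal((N, D))
X[:10] *= 5.0                     # 10 outliers with covariance a*Sigma, a = 25

mu = X.mean(axis=0)
Pinv = np.linalg.inv(np.cov(X.T))
# Squared Mahalanobis distance of each sample to the mean.
d2 = np.einsum('ij,jk,ik->i', X - mu, Pinv, X - mu)
flagged = np.argsort(d2)[-10:]    # the 10 most suspicious points
print(int((flagged < 10).sum()))  # most flagged points are the planted outliers
```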

  9.

    http://cvc.yale.edu/projects/yalefacesB/yalefacesB.html.

References

  • Amaldi, E., & Kann, V. (1998). On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theoretical Computer Science, 209, 237–260.


  • Bach, F. (2013). Convex relaxations of structured matrix factorizations. arXiv:1309.3117v1.


  • Barnett, V., & Lewis, T. (1983). Outliers in statistical data (2nd ed.). New York: Wiley.


  • Basri, R., & Jacobs, D. (2003). Lambertian reflection and linear subspaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(2), 218–233.


  • Bertsekas, D. P. (1999). Nonlinear programming (2nd ed.). Optimization and Computation (Vol. 2). Belmont: Athena Scientific.


  • Brandt, S. (2002). Closed-form solutions for affine reconstruction under missing data. In Proceedings of the Statistical Methods for Video Processing Workshop (ECCV’02).


  • Buchanan, A., & Fitzgibbon, A. (2005). Damped Newton algorithms for matrix factorization with missing data. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 316–322).


  • Burer, S., & Monteiro, R. D. C. (2005). Local minima and convergence in low-rank semidefinite programming. Mathematical Programming, Series A, 103(3), 427–444.


  • Cai, J.-F., Candès, E. J., & Shen, Z. (2008). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 1956–1982.


  • Candès, E. (2006). Compressive sampling. In Proceedings of the International Congress of Mathematicians.


  • Candès, E. (2008). The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9–10), 589–592.


  • Candès, E., Li, X., Ma, Y., & Wright, J. (2011). Robust principal component analysis? Journal of the ACM, 58(3).


  • Candès, E., & Plan, Y. (2010). Matrix completion with noise. Proceedings of the IEEE, 98(6), 925–936.


  • Candès, E., & Recht, B. (2009). Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9, 717–772.


  • Candès, E., & Tao, T. (2005). Decoding by linear programming. IEEE Transactions on Information Theory, 51(12), 4203–4215.


  • Candès, E., & Tao, T. (2010). The power of convex relaxation: Near-optimal matrix completion. IEEE Transactions on Information Theory, 56(5), 2053–2080.


  • Chandrasekaran, V., Sanghavi, S., Parrilo, P., & Willsky, A. (2009). Sparse and low-rank matrix decompositions. In IFAC Symposium on System Identification.


  • De la Torre, F., & Black, M. J. (2004). A framework for robust subspace learning. International Journal of Computer Vision, 54(1), 117–142.


  • Fei-Fei, L., Fergus, R., & Perona, P. (2004). Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. In Workshop on Generative Model Based Vision.


  • Ferguson, T. (1961). On the rejection of outliers. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability.


  • Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus (RANSAC): A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395.


  • Ganesh, A., Wright, J., Li, X., Candès, E., & Ma, Y. (2010). Dense error correction for low-rank matrices via principal component pursuit. In International Symposium on Information Theory.


  • Geman, S., & McClure, D. (1987). Statistical methods for tomographic image reconstruction. In Proceedings of the 46th Session of the ISI, Bulletin of the ISI (Vol. 52, pp. 5–21).


  • Goldfarb, D., & Ma, S. (2009). Convergence of fixed point continuation algorithms for matrix rank minimization. Preprint.


  • Golub, G. H., & Van Loan, C. F. (1996). Matrix computations (3rd ed.). Baltimore: Johns Hopkins University Press.


  • Gross, D. (2011). Recovering low-rank matrices from few coefficients in any basis. IEEE Transactions on Information Theory, 57(3), 1548–1566.


  • Gruber, A., & Weiss, Y. (2004). Multibody factorization with uncertainty and missing data using the EM algorithm. In IEEE Conference on Computer Vision and Pattern Recognition (Vol. I, pp. 707–714).


  • Aanaes, H., Fisker, R., Astrom, K., & Carstensen, J. M. (2002). Robust factorization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9), 1215–1225.


  • Haeffele, B., & Vidal, R. (2015). Global optimality in tensor factorization, deep learning, and beyond. Preprint, http://arxiv.org/abs/1506.07540.

  • Haeffele, B., Young, E., & Vidal, R. (2014). Structured low-rank matrix factorization: Optimality, algorithm, and applications to image processing. In International Conference on Machine Learning.


  • Hardt, M. (2014). Understanding alternating minimization for matrix completion. In Symposium on Foundations of Computer Science.


  • Hartley, R., & Schaffalitzky, F. (2003). Powerfactorization: An approach to affine reconstruction with missing and uncertain data. In Proceedings of Australia-Japan Advanced Workshop on Computer Vision.


  • Huber, P. (1981). Robust Statistics. New York: Wiley.


  • Jacobs, D. (2001). Linear fitting with missing data: Applications to structure-from-motion. Computer Vision and Image Understanding, 82, 57–81.


  • Jain, P., Meka, R., & Dhillon, I. (2010). Guaranteed rank minimization via singular value projection. In Neural Information Processing Systems (pp. 937–945).


  • Jain, P., & Netrapalli, P. (2014). Fast exact matrix completion with finite samples. http://arxiv.org/pdf/1411.1087v1.pdf.

  • Jain, P., Netrapalli, P., & Sanghavi, S. (2012). Low-rank matrix completion using alternating minimization. arXiv preprint.

  • Johnson, C. (1990). Matrix completion problems: A survey. In Proceedings of Symposia in Applied Mathematics.


  • Jolliffe, I. (2002). Principal Component Analysis (2nd ed.). New York: Springer.


  • Ke, Q., & Kanade, T. (2005). Robust 1-norm factorization in the presence of outliers and missing data. In IEEE Conference on Computer Vision and Pattern Recognition.


  • Keshavan, R., Montanari, A., & Oh, S. (2010a). Matrix completion from a few entries. IEEE Transactions on Information Theory.


  • Keshavan, R., Montanari, A., & Oh, S. (2010b). Matrix completion from noisy entries. Journal of Machine Learning Research, 11, 2057–2078.


  • Keshavan, R. H. (2012). Efficient algorithms for collaborative filtering. Ph.D. Thesis. Stanford University.


  • Kontogiorgis, S., & Meyer, R. (1998). A variable-penalty alternating direction method for convex optimization. Mathematical Programming, 83, 29–53.


  • Lanczos, C. (1950). An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. Journal of Research of the National Bureau of Standards, 45, 255–282.


  • Lin, Z., Chen, M., Wu, L., & Ma, Y. (2011). The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv:1009.5055v2.


  • Lions, P., & Mercier, B. (1979). Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis, 16(6), 964–979.


  • Recht, B., Fazel, M., & Parrilo, P. (2010). Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Review, 52(3), 471–501.


  • Shum, H.-Y., Ikeuchi, K., & Reddy, R. (1995). Principal component analysis with missing data and its application to polyhedral object modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(9), 854–867.


  • Soltanolkotabi, M., & Candès, E. J. (2013). A geometric analysis of subspace clustering with outliers. Annals of Statistics, 40(4), 2195–2238.


  • Stewart, C. V. (1999). Robust parameter estimation in computer vision. SIAM Review, 41(3), 513–537.


  • Udell, M., Horn, C., Zadeh, R., & Boyd, S. (2015). Generalized low rank models. Working manuscript.


  • Wiberg, T. (1976). Computation of principal components when data are missing. In Symposium on Computational Statistics (pp. 229–326).


  • Wright, J., Ganesh, A., Min, K., & Ma, Y. (2013). Compressive principal component pursuit. IMA Journal on Information and Inference, 2(1), 32–68.


  • Wright, J., Ganesh, A., Rao, S., Peng, Y., & Ma, Y. (2009a). Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. In Neural Information Processing Systems (NIPS).


  • Xu, H., Caramanis, C., & Sanghavi, S. (2010). Robust PCA via outlier pursuit. In Neural Information Processing Systems (NIPS).


  • Yuan, X., & Yang, J. (2009). Sparse and low-rank matrix decomposition via alternating direction methods. Preprint.


  • Zhou, M., Wang, C., Chen, M., Paisley, J., Dunson, D., & Carin, L. (2010a). Nonparametric Bayesian matrix completion. In Sensor Array and Multichannel Signal Processing Workshop.


  • Zhou, Z., Wright, J., Li, X., Candès, E., & Ma, Y. (2010b). Stable principal component pursuit. In International Symposium on Information Theory.




Copyright information

© 2016 Springer-Verlag New York


Cite this chapter

Vidal, R., Ma, Y., Sastry, S.S. (2016). Robust Principal Component Analysis. In: Generalized Principal Component Analysis. Interdisciplinary Applied Mathematics, vol 40. Springer, New York, NY. https://doi.org/10.1007/978-0-387-87811-9_3
