Exact Matrix Completion via Convex Optimization

Abstract

We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen?

We show that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys
$$m\ge C\,n^{1.2}r\log n$$
for some positive numerical constant C, then with very high probability, most n×n matrices of rank r can be perfectly recovered by solving a simple convex optimization program. This program finds the matrix with minimum nuclear norm that fits the data. The condition above assumes that the rank is not too large. However, if one replaces the 1.2 exponent with 1.25, then the result holds for all values of the rank. Similar results hold for arbitrary rectangular matrices as well. Our results are connected with the recent literature on compressed sensing, and show that objects other than signals and images can be perfectly reconstructed from very limited information.
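To get a rough feel for the sampling condition, the following minimal sketch evaluates its right-hand side and compares it with the n² entries of the full matrix. The abstract does not give the constant C, so the value C = 1 used here is purely an illustrative assumption, as is the choice of the natural logarithm; the function name `sample_bound` is hypothetical.

```python
import math


def sample_bound(n: int, r: int, C: float = 1.0, exponent: float = 1.2) -> float:
    """Right-hand side C * n**exponent * r * log(n) of the sampling condition.

    C = 1.0 is an illustrative assumption (the abstract leaves C unspecified).
    Pass exponent=1.25 for the variant that holds for all values of the rank.
    """
    return C * n ** exponent * r * math.log(n)


if __name__ == "__main__":
    r = 5  # hypothetical rank, chosen only for illustration
    for n in (1_000, 10_000, 100_000):
        m = sample_bound(n, r)
        # Fraction of the n*n entries that the bound asks us to observe.
        print(f"n={n:>7}  bound={m:.3e}  fraction of entries={m / n**2:.4f}")
```

Under these assumptions the required fraction of observed entries shrinks as n grows, since n^{1.2} r log n grows more slowly than n² for fixed r.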

Keywords

Matrix completion, Low-rank matrices, Convex optimization, Duality in optimization, Nuclear norm minimization, Random matrices, Noncommutative Khintchine inequality, Decoupling, Compressed sensing

Mathematics Subject Classification (2000)

90C25 90C59 15A52 

Copyright information

© The Author(s) 2009

Authors and Affiliations

  1. Applied and Computational Mathematics, Caltech, Pasadena, USA
  2. Center for the Mathematics of Information, Caltech, Pasadena, USA