Abstract
We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen?
We show that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys
$$m \ge C\, n^{1.2}\, r \log n$$
for some positive numerical constant C, then with very high probability, most n×n matrices of rank r can be perfectly recovered by solving a simple convex optimization program. This program finds the matrix with minimum nuclear norm that fits the data. The condition above assumes that the rank is not too large. However, if one replaces the 1.2 exponent with 1.25, then the result holds for all values of the rank. Similar results hold for arbitrary rectangular matrices as well. Our results are connected with the recent literature on compressed sensing, and show that objects other than signals and images can be perfectly reconstructed from very limited information.
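As a rough illustration of recovery by nuclear norm minimization, the sketch below fills in a random low-rank matrix from a uniformly sampled subset of its entries. It is not the exact program described in the abstract: instead of enforcing agreement with the observed entries as a hard constraint, it runs a simple iteration on a penalized relaxation (squared misfit on observed entries plus `lam` times the nuclear norm), soft-thresholding the singular values at each step. The function names, the penalty `lam`, the iteration count, and the test sizes are all assumptions chosen for illustration.

```python
import numpy as np

def shrink_singular_values(A, tau):
    """Soft-threshold the singular values of A by tau.

    This is the proximal operator of tau * (nuclear norm).
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(X, M, mask, lam):
    """Penalized objective: 0.5 * misfit on observed entries + lam * ||X||_*."""
    misfit = 0.5 * np.sum((mask * (X - M)) ** 2)
    return misfit + lam * np.linalg.norm(X, ord="nuc")

def complete_matrix(M, mask, lam=0.1, n_iters=500):
    """Hypothetical helper: iterate 'fill in, then shrink' (illustrative only).

    mask[i, j] is True where M[i, j] was observed; other entries of M are ignored.
    """
    X = np.zeros_like(M, dtype=float)
    for _ in range(n_iters):
        # Keep observed entries of M, use the current estimate elsewhere.
        filled = np.where(mask, M, X)
        # Shrink singular values to push the estimate toward low nuclear norm.
        X = shrink_singular_values(filled, lam)
    return X

def demo(n=30, r=2, frac=0.6, seed=0):
    """Build a random rank-r matrix, sample entries uniformly, and complete it."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
    mask = rng.random((n, n)) < frac  # each entry observed with probability frac
    X = complete_matrix(M, mask)
    rel_err = np.linalg.norm(X - M) / np.linalg.norm(M)
    return M, mask, X, rel_err

if __name__ == "__main__":
    _, _, _, rel_err = demo()
    print(f"relative recovery error: {rel_err:.3e}")
```

Because of the penalty, the output only approximates the minimum nuclear norm solution; a smaller `lam` brings it closer to exact agreement with the observed entries at the cost of slower convergence. A general-purpose convex solver applied to the equality-constrained program would match the abstract's formulation more directly.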
Additional information
Communicated by Michael Todd.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Candès, E.J., Recht, B. Exact Matrix Completion via Convex Optimization. Found Comput Math 9, 717–772 (2009). https://doi.org/10.1007/s10208-009-9045-5
DOI: https://doi.org/10.1007/s10208-009-9045-5
Keywords
- Matrix completion
- Low-rank matrices
- Convex optimization
- Duality in optimization
- Nuclear norm minimization
- Random matrices
- Noncommutative Khintchine inequality
- Decoupling
- Compressed sensing