
Max-norm optimization for robust matrix recovery

  • Full Length Paper
  • Series B
  • Mathematical Programming

Abstract

This paper studies the matrix completion problem under arbitrary sampling schemes. We propose a new estimator incorporating both max-norm and nuclear-norm regularization, based on which we can conduct efficient low-rank matrix recovery using a random subset of entries observed with additive noise under general non-uniform and unknown sampling distributions. This method significantly relaxes the uniform sampling assumption required by the widely used nuclear-norm penalized approach, and makes low-rank matrix recovery feasible in more practical settings. Theoretically, we prove that the proposed estimator achieves fast rates of convergence under different settings. Computationally, we propose an alternating direction method of multipliers algorithm to efficiently compute the estimator, which bridges a gap between theory and practice of machine learning methods with max-norm regularization. Further, we provide thorough numerical studies to evaluate the proposed method using both simulated and real datasets.





Author information

Corresponding author

Correspondence to Ethan X. Fang.

Additional information

K.-C. Toh: Research supported in part by Ministry of Education Academic Research Fund R-146-000-194-112.

Extensions

In this section, we consider solving the max-norm constrained version of the optimization problem (2.3). In particular, we consider

$$\begin{aligned} \min _{M \in \mathbb {R}^{d_1\times d_2} } \frac{1}{2}\sum _{t=1}^n \big ( Y_{i_t,j_t} - M_{i_t,j_t} \big )^2 + \mu \Vert M\Vert _*, \text { subject to }\Vert M\Vert _\infty \le \alpha , \ \Vert M\Vert _{\max }\le R. \end{aligned}$$
(7.1)

This problem can be reformulated as the following SDP:

$$\begin{aligned} \begin{aligned} \min _{Z\in \mathbb {R}^{d \times d}}&~\frac{1}{2}\sum _{t=1}^n(Y_{i_t,j_t} - Z^{12}_{i_t,j_t})^2 + {\mu \langle I,\, Z\rangle }, \\ \text {subject to }&~ \Vert Z^{12}\Vert _\infty \le \alpha , \ \Vert \mathrm{diag}(Z)\Vert _\infty \le R, \ \ Z\succeq 0. \end{aligned} \end{aligned}$$
(7.2)
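
To see why (7.2) captures the constraints of (7.1), recall the standard semidefinite characterization of the max norm: writing \(d = d_1+d_2\) and letting \(Z^{11}\), \(Z^{12}\), \(Z^{22}\) denote the blocks of a symmetric matrix \(Z\),

$$\begin{aligned} \Vert M\Vert _{\max } = \min \Big \{ \Vert \mathrm{diag}(Z)\Vert _\infty : Z = \begin{pmatrix} Z^{11} & M\\ M^\top & Z^{22} \end{pmatrix}\succeq 0 \Big \}. \end{aligned}$$

Hence the constraint \(\Vert M\Vert _{\max }\le R\) translates into \(\Vert \mathrm{diag}(Z)\Vert _\infty \le R\) together with \(Z\succeq 0\) and \(Z^{12}=M\), while \(\langle I, Z\rangle = \mathrm{tr}(Z^{11})+\mathrm{tr}(Z^{22})\ge 2\Vert M\Vert _*\) for any such \(Z\), so the trace term in (7.2) acts as a convex surrogate for the nuclear-norm penalty in (7.1).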

Let the loss function be

$$\begin{aligned} \mathcal {L}(Z) = \frac{1}{2} \sum _{t=1}^n \big ( Y_{i_t,j_t} - Z^{12}_{i_t,j_t} \big )^2. \end{aligned}$$

We define the set

$$\begin{aligned} \mathcal {P}= \{Z\in {\mathcal {S}}^d: \mathrm{diag}(Z)\ge 0, \Vert Z^{11}\Vert _{\infty } \le R, \Vert Z^{22}\Vert _\infty \le R, \Vert Z^{12}\Vert _\infty \le \alpha \}. \end{aligned}$$

This yields the following equivalent formulation of (7.2), which is more amenable to computation:

$$\begin{aligned} \min _{X,Z}\mathcal {L}(Z) + \mu \langle X, I\rangle , \text { subject to }X\succeq 0, \ Z\in \mathcal {P}, X-Z=0. \end{aligned}$$
(7.3)

We consider the augmented Lagrangian function of (7.3) defined by

$$\begin{aligned} L(X,Z;W) =\mathcal {L}(Z) + \mu \langle X, I\rangle + \langle W,X-Z\rangle + \frac{\rho }{2} \Vert X-Z\Vert _F^2, \ X\in {\mathcal {S}}^d_+, \ Z\in \mathcal {P}, \end{aligned}$$

where W is the dual variable. Then, it is natural to apply the ADMM to solve the problem (7.3). At the t-th iteration, we update \((X, Z, W)\) by

$$\begin{aligned} \begin{aligned} X^{t+1}&= \mathop {\mathrm {argmin}}_{X\in {\mathcal {S}}^d_+} L(X,Z^t;W^t) = \Pi _{{\mathcal {S}}_+^{d}}\big \{Z^t -\rho ^{-1}{(W^t+\mu I)}\big \},\\ Z^{t+1}&= \mathop {\mathrm {argmin}}_{Z\in \mathcal {P}} L(X^{t+1},Z;W^t) = \mathop {\mathrm {argmin}}_{Z\in \mathcal {P}} \mathcal {L}(Z) +\frac{\rho }{2} \Vert Z - X^{t+1}- \rho ^{-1}W^t\Vert _F^2,\\ W^{t+1}&= W^{t} + {\tau } \rho (X^{t+1}-Z^{t+1}), \end{aligned} \end{aligned}$$
(7.4)

where \(\tau >0\) is a step-length parameter for the dual update (a common choice is \(\tau \in (0,(1+\sqrt{5})/2)\)).
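
To see why the X-update in (7.4) is a single projection onto the positive semidefinite cone, complete the square in \(X\) for fixed \(Z\) and \(W\):

$$\begin{aligned} \mu \langle X, I\rangle + \langle W,X-Z\rangle + \frac{\rho }{2} \Vert X-Z\Vert _F^2 = \frac{\rho }{2} \Big \Vert X - \big (Z - \rho ^{-1}(W+\mu I)\big )\Big \Vert _F^2 + c, \end{aligned}$$

where \(c\) does not depend on \(X\). Minimizing over \(X\in {\mathcal {S}}^d_+\) therefore amounts to the Euclidean projection \(\Pi _{{\mathcal {S}}_+^{d}}\), which is computed by forming a spectral decomposition of the argument and zeroing out its negative eigenvalues.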

The next proposition provides a closed-form solution for the Z-subproblem in (7.4).

Proposition 7.1

Denote the observed set of indices of \(M^0\) by \(\Omega = \{(i_t,j_t)\}_{t=1}^n\). For a given symmetric matrix \(C\in \mathbb {R}^{d\times d}\), the minimizer \(Z^\star = \mathop {\mathrm {argmin}}_{Z\in \mathcal {P}} \mathcal {L}(Z) +\frac{\rho }{2} \Vert Z - C\Vert _F^2\) is given entrywise by

$$\begin{aligned} (Z^\star )^{12}_{ij} = {\left\{ \begin{array}{ll} \Pi _{[-\alpha ,\alpha ]}\Big (\frac{Y_{ij} + 2\rho C^{12}_{ij}}{1+2\rho }\Big ), & (i,j)\in \Omega ,\\ \Pi _{[-\alpha ,\alpha ]}\big (C^{12}_{ij}\big ), & (i,j)\notin \Omega , \end{array}\right. } \end{aligned}$$
(7.5)

where the diagonal blocks are obtained by clipping: \((Z^\star )^{11}_{ij} = \Pi _{[-R,R]}(C^{11}_{ij})\) for \(i\ne j\), \((Z^\star )^{11}_{ii} = \Pi _{[0,R]}(C^{11}_{ii})\), and analogously for \((Z^\star )^{22}\), and \(\Pi _{[a,b]}(x) = \min \{b,\max (a,x) \}\) projects \(x\in \mathbb {R}\) to the interval \([a,b]\).

We summarize the algorithm for solving the problem (7.2) below.

Algorithm (ADMM for (7.2)): alternate the three updates in (7.4), with the Z-subproblem solved entrywise as in Proposition 7.1, until convergence.
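
Since the algorithm box is rendered as a figure in the original article, the following is a minimal, self-contained NumPy sketch of the iteration (7.4), not the authors' implementation: the function and variable names, the zero initialization, the default parameter values, and the stopping rule are illustrative assumptions, and the entrywise Z-step is the clipping rule derived from the Z-subproblem in (7.4) (cf. Proposition 7.1).

import numpy as np


def psd_project(A):
    # Projection onto the PSD cone: symmetrize, then zero out negative eigenvalues.
    A = (A + A.T) / 2.0
    eigval, eigvec = np.linalg.eigh(A)
    return (eigvec * np.clip(eigval, 0.0, None)) @ eigvec.T


def admm_max_norm(Y_obs, d1, d2, alpha, R, mu, rho=1.0, tau=1.0, max_iter=500, tol=1e-4):
    # Sketch of the ADMM updates in (7.4) for problem (7.2).
    # Y_obs: dict mapping a zero-based observed index (i, j) to the noisy entry Y_ij.
    d = d1 + d2
    X = np.zeros((d, d))
    Z = np.zeros((d, d))
    W = np.zeros((d, d))
    idx = list(Y_obs.keys())
    rows = np.array([i for i, _ in idx])
    cols = np.array([j for _, j in idx])
    y = np.array([Y_obs[k] for k in idx])

    for _ in range(max_iter):
        # X-update: project Z - (W + mu*I)/rho onto the PSD cone.
        X = psd_project(Z - (W + mu * np.eye(d)) / rho)

        # Z-update: entrywise clipping applied to C = X + W/rho (a symmetric matrix).
        C = X + W / rho
        Z = np.clip(C, -R, R)                  # clip all entries; Z^{12} is overwritten below
        np.fill_diagonal(Z, np.clip(np.diag(C), 0.0, R))   # diag(Z) lies in [0, R]
        B = np.clip(C[:d1, d1:], -alpha, alpha)            # unobserved entries of Z^{12}
        c_obs = C[:d1, d1:][rows, cols]
        B[rows, cols] = np.clip((y + 2.0 * rho * c_obs) / (1.0 + 2.0 * rho), -alpha, alpha)
        Z[:d1, d1:] = B
        Z[d1:, :d1] = B.T                      # keep Z symmetric

        # Dual update with step length tau.
        W = W + tau * rho * (X - Z)

        if np.linalg.norm(X - Z, 'fro') <= tol * max(1.0, np.linalg.norm(Z, 'fro')):
            break

    return Z[:d1, d1:]

For instance, calling admm_max_norm(Y_obs, d1, d2, alpha, R, mu) on a dictionary of observed entries returns the recovered \(d_1\times d_2\) matrix, i.e., the \(Z^{12}\) block at the final iterate.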


Cite this article

Fang, E.X., Liu, H., Toh, KC. et al. Max-norm optimization for robust matrix recovery. Math. Program. 167, 5–35 (2018). https://doi.org/10.1007/s10107-017-1159-y
