
Max-norm optimization for robust matrix recovery

  • Full Length Paper
  • Series B
  • Mathematical Programming

Abstract

This paper studies the matrix completion problem under arbitrary sampling schemes. We propose a new estimator incorporating both max-norm and nuclear-norm regularization, based on which we can conduct efficient low-rank matrix recovery using a random subset of entries observed with additive noise under general non-uniform and unknown sampling distributions. This method significantly relaxes the uniform sampling assumption required by the widely used nuclear-norm penalized approach, and makes low-rank matrix recovery feasible in more practical settings. Theoretically, we prove that the proposed estimator achieves fast rates of convergence under different settings. Computationally, we propose an alternating direction method of multipliers algorithm to efficiently compute the estimator, which bridges a gap between theory and practice of machine learning methods with max-norm regularization. Further, we provide thorough numerical studies to evaluate the proposed method using both simulated and real datasets.





Author information

Corresponding author

Correspondence to Ethan X. Fang.

Additional information

K.-C. Toh: Research supported in part by Ministry of Education Academic Research Fund R-146-000-194-112.

Extensions

In this section, we consider solving the max-norm constrained version of the optimization problem (2.3). In particular, we consider

$$\begin{aligned} \min _{M \in \mathbb {R}^{d_1\times d_2} } \frac{1}{2}\sum _{t=1}^n \big ( Y_{i_t,j_t} - M_{i_t,j_t} \big )^2 + \mu \Vert M\Vert _*, \text { subject to }\Vert M\Vert _\infty \le \alpha , \ \Vert M\Vert _{\max }\le R. \end{aligned}$$
(7.1)

This problem can be reformulated as the following SDP:

$$\begin{aligned} \begin{aligned} \min _{Z\in \mathbb {R}^{d \times d}}&~\frac{1}{2}\sum _{t=1}^n(Y_{i_t,j_t} - Z^{12}_{i_t,j_t})^2 + {\mu \langle I,\, Z\rangle }, \\ \text {subject to }&~ \Vert Z^{12}\Vert _\infty \le \alpha , \ \Vert \mathrm{diag}(Z)\Vert _\infty \le R, \ \ Z\succeq 0. \end{aligned} \end{aligned}$$
(7.2)
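
To see why (7.2) captures the constraints of (7.1), recall the standard semidefinite characterization of the max norm: writing \(d = d_1+d_2\) and letting \(Z^{11}\), \(Z^{12}\), \(Z^{22}\) denote the blocks of a symmetric matrix \(Z\),

$$\begin{aligned} \Vert M\Vert _{\max } = \min \Big \{ \Vert \mathrm{diag}(Z)\Vert _\infty : Z = \begin{pmatrix} Z^{11} & M\\ M^\top & Z^{22} \end{pmatrix}\succeq 0 \Big \}. \end{aligned}$$

Hence the constraint \(\Vert M\Vert _{\max }\le R\) translates into \(\Vert \mathrm{diag}(Z)\Vert _\infty \le R\) together with \(Z\succeq 0\) and \(Z^{12}=M\), while \(\langle I, Z\rangle = \mathrm{tr}(Z^{11})+\mathrm{tr}(Z^{22})\ge 2\Vert M\Vert _*\) for any such \(Z\), so the trace term in (7.2) acts as a convex surrogate for the nuclear-norm penalty in (7.1).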

Let the loss function be

$$\begin{aligned} \mathcal {L}(Z) = \frac{1}{2} \sum _{t=1}^n \big ( Y_{i_t,j_t} - Z^{12}_{i_t,j_t} \big )^2. \end{aligned}$$

We define the set

$$\begin{aligned} \mathcal {P}= \{Z\in {\mathcal {S}}^d: \mathrm{diag}(Z)\ge 0, \Vert Z^{11}\Vert _{\infty } \le R, \Vert Z^{22}\Vert _\infty \le R, \Vert Z^{12}\Vert _\infty \le \alpha \}. \end{aligned}$$

This yields the following equivalent formulation of (7.2), which is more amenable to computation:

$$\begin{aligned} \min _{X,Z}\mathcal {L}(Z) + \mu \langle X, I\rangle , \text { subject to }X\succeq 0, \ Z\in \mathcal {P}, X-Z=0. \end{aligned}$$
(7.3)

We consider the augmented Lagrangian function of (7.3) defined by

$$\begin{aligned} L(X,Z;W) =\mathcal {L}(Z) + \mu \langle X, I\rangle + \langle W,X-Z\rangle + \frac{\rho }{2} \Vert X-Z\Vert _F^2, \ X\in {\mathcal {S}}^d_+, \ Z\in \mathcal {P}, \end{aligned}$$

where W is the dual variable. Then, it is natural to apply the ADMM to solve the problem (7.3). At the t-th iteration, we update \((X, Z, W)\) by

$$\begin{aligned} \begin{aligned} X^{t+1}&= \mathop {\mathrm {argmin}}_{X\in {\mathcal {S}}^d_+} L(X,Z^t;W^t) = \Pi _{{\mathcal {S}}_+^{d}}\big \{Z^t -\rho ^{-1}{(W^t+\mu I)}\big \},\\ Z^{t+1}&= \mathop {\mathrm {argmin}}_{Z\in \mathcal {P}} L(X^{t+1},Z;W^t) = \mathop {\mathrm {argmin}}_{Z\in \mathcal {P}} \mathcal {L}(Z) +\frac{\rho }{2} \Vert Z - X^{t+1}- \rho ^{-1}W^t\Vert _F^2,\\ W^{t+1}&= W^{t} + {\tau } \rho (X^{t+1}-Z^{t+1}), \end{aligned} \end{aligned}$$
(7.4)

where \(\tau >0\) is a step-length parameter for the dual update (a common choice is \(\tau \in (0,(1+\sqrt{5})/2)\)).
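
To see why the X-update in (7.4) is a single projection onto the positive semidefinite cone, complete the square in \(X\) for fixed \(Z\) and \(W\):

$$\begin{aligned} \mu \langle X, I\rangle + \langle W,X-Z\rangle + \frac{\rho }{2} \Vert X-Z\Vert _F^2 = \frac{\rho }{2} \Big \Vert X - \big (Z - \rho ^{-1}(W+\mu I)\big )\Big \Vert _F^2 + c, \end{aligned}$$

where \(c\) does not depend on \(X\). Minimizing over \(X\in {\mathcal {S}}^d_+\) therefore amounts to the Euclidean projection \(\Pi _{{\mathcal {S}}_+^{d}}\), which is computed by forming a spectral decomposition of the argument and zeroing out its negative eigenvalues.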

The next proposition provides a closed-form solution for the Z-subproblem in (7.4).

Proposition 7.1

Denote the observed set of indices of \(M^0\) by \(\Omega = \{(i_t,j_t)\}_{t=1}^n\). For a given symmetric matrix \(C\in \mathbb {R}^{d\times d}\), the minimizer \(Z^\star = \mathop {\mathrm {argmin}}_{Z\in \mathcal {P}} \mathcal {L}(Z) +\frac{\rho }{2} \Vert Z - C\Vert _F^2\) is given entrywise by

$$\begin{aligned} (Z^\star )^{12}_{ij} = {\left\{ \begin{array}{ll} \Pi _{[-\alpha ,\alpha ]}\Big (\frac{Y_{ij} + 2\rho C^{12}_{ij}}{1+2\rho }\Big ), & (i,j)\in \Omega ,\\ \Pi _{[-\alpha ,\alpha ]}\big (C^{12}_{ij}\big ), & (i,j)\notin \Omega , \end{array}\right. } \end{aligned}$$
(7.5)

where the diagonal blocks are obtained by clipping: \((Z^\star )^{11}_{ij} = \Pi _{[-R,R]}(C^{11}_{ij})\) for \(i\ne j\), \((Z^\star )^{11}_{ii} = \Pi _{[0,R]}(C^{11}_{ii})\), and analogously for \((Z^\star )^{22}\), and \(\Pi _{[a,b]}(x) = \min \{b,\max (a,x) \}\) projects \(x\in \mathbb {R}\) to the interval \([a,b]\).

We summarize the algorithm for solving the problem (7.2) below.

Algorithm (ADMM for (7.2)): alternate the three updates in (7.4), with the Z-subproblem solved entrywise as in Proposition 7.1, until convergence.
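
Since the algorithm box is rendered as a figure in the original article, the following is a minimal, self-contained NumPy sketch of the iteration (7.4), not the authors' implementation: the function and variable names, the zero initialization, the default parameter values, and the stopping rule are illustrative assumptions, and the entrywise Z-step is the clipping rule derived from the Z-subproblem in (7.4) (cf. Proposition 7.1).

import numpy as np


def psd_project(A):
    # Projection onto the PSD cone: symmetrize, then zero out negative eigenvalues.
    A = (A + A.T) / 2.0
    eigval, eigvec = np.linalg.eigh(A)
    return (eigvec * np.clip(eigval, 0.0, None)) @ eigvec.T


def admm_max_norm(Y_obs, d1, d2, alpha, R, mu, rho=1.0, tau=1.0, max_iter=500, tol=1e-4):
    # Sketch of the ADMM updates in (7.4) for problem (7.2).
    # Y_obs: dict mapping a zero-based observed index (i, j) to the noisy entry Y_ij.
    d = d1 + d2
    X = np.zeros((d, d))
    Z = np.zeros((d, d))
    W = np.zeros((d, d))
    idx = list(Y_obs.keys())
    rows = np.array([i for i, _ in idx])
    cols = np.array([j for _, j in idx])
    y = np.array([Y_obs[k] for k in idx])

    for _ in range(max_iter):
        # X-update: project Z - (W + mu*I)/rho onto the PSD cone.
        X = psd_project(Z - (W + mu * np.eye(d)) / rho)

        # Z-update: entrywise clipping applied to C = X + W/rho (a symmetric matrix).
        C = X + W / rho
        Z = np.clip(C, -R, R)                  # clip all entries; Z^{12} is overwritten below
        np.fill_diagonal(Z, np.clip(np.diag(C), 0.0, R))   # diag(Z) lies in [0, R]
        B = np.clip(C[:d1, d1:], -alpha, alpha)            # unobserved entries of Z^{12}
        c_obs = C[:d1, d1:][rows, cols]
        B[rows, cols] = np.clip((y + 2.0 * rho * c_obs) / (1.0 + 2.0 * rho), -alpha, alpha)
        Z[:d1, d1:] = B
        Z[d1:, :d1] = B.T                      # keep Z symmetric

        # Dual update with step length tau.
        W = W + tau * rho * (X - Z)

        if np.linalg.norm(X - Z, 'fro') <= tol * max(1.0, np.linalg.norm(Z, 'fro')):
            break

    return Z[:d1, d1:]

For instance, calling admm_max_norm(Y_obs, d1, d2, alpha, R, mu) on a dictionary of observed entries returns the recovered \(d_1\times d_2\) matrix, i.e., the \(Z^{12}\) block at the final iterate.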


Cite this article

Fang, E.X., Liu, H., Toh, KC. et al. Max-norm optimization for robust matrix recovery. Math. Program. 167, 5–35 (2018). https://doi.org/10.1007/s10107-017-1159-y
