
Parallel stochastic gradient algorithms for large-scale matrix completion

Full Length Paper, published in Mathematical Programming Computation.

Abstract

This paper develops Jellyfish, an algorithm for solving data-processing problems with matrix-valued decision variables regularized to have low rank. Particular examples of problems solvable by Jellyfish include matrix completion problems and least-squares problems regularized by the nuclear norm or \(\gamma_2\)-norm. Jellyfish implements a projected incremental gradient method with a biased, random ordering of the increments. This biased ordering allows for a parallel implementation that admits a speed-up nearly proportional to the number of processors. On large-scale matrix completion tasks, Jellyfish is orders of magnitude more efficient than existing codes. For example, on the Netflix Prize data set, prior art computes rating predictions in approximately 4 hours, while Jellyfish solves the same problem in under 3 minutes on a 12-core workstation.
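To make the method concrete, here is a minimal serial sketch, in Python with NumPy, of the factored incremental gradient step that Jellyfish parallelizes. This is an illustration under assumptions, not the authors' code: the function name, step size, regularizer, and hyperparameters are invented for the example. The decision variable is kept in factored form \(X = LR^T\), so each increment touches only one row of \(L\) and one row of \(R\).

```python
# Minimal illustrative sketch (not the authors' code) of factored incremental
# gradient descent for matrix completion. Assumed objective: squared error on
# the observed entries plus Frobenius-norm regularization of the factors,
# with the decision variable kept in factored form X = L @ R.T.
import numpy as np

def sgd_complete(entries, n_rows, n_cols, rank=10,
                 step=0.05, reg=1e-3, epochs=10, seed=0):
    """entries: list of (row, col, value) triples of observed entries."""
    rng = np.random.default_rng(seed)
    L = rng.standard_normal((n_rows, rank)) / np.sqrt(rank)
    R = rng.standard_normal((n_cols, rank)) / np.sqrt(rank)
    order = list(entries)
    for _ in range(epochs):
        rng.shuffle(order)                # random ordering of the increments
        for u, v, val in order:
            err = L[u] @ R[v] - val       # residual at one observed entry
            gu = err * R[v] + reg * L[u]  # gradient w.r.t. row u of L
            gv = err * L[u] + reg * R[v]  # gradient w.r.t. row v of R
            L[u] -= step * gu             # each increment touches one row of
            R[v] -= step * gv             # L and one row of R only
    return L, R

# Usage on a synthetic rank-2 matrix with roughly 30% of entries observed.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
    mask = rng.random(M.shape) < 0.3
    obs = [(u, v, M[u, v]) for u, v in zip(*np.nonzero(mask))]
    L, R = sgd_complete(obs, *M.shape, rank=2, epochs=200)
    print("RMSE on unobserved entries:",
          np.sqrt(np.mean((L @ R.T - M)[~mask] ** 2)))
```

Because each increment writes to a single row of \(L\) and a single row of \(R\), increments whose observed entries fall in disjoint row and column blocks commute; the paper's biased ordering exploits exactly this structure to schedule blocks on different cores simultaneously.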


Notes

  1. We were also unable to find implementations of similar algorithms written in a low-level language such as C. While coding directly in C or Fortran would likely yield a version of NNLS considerably faster than the Matlab version, we doubt it would yield the hundred-fold speedup necessary to be competitive with Jellyfish.

References

  1. Alon, N., Naor, A.: Approximating the cut-norm via Grothendieck’s inequality. SIAM J. Comput. 35(4), 787–803 (2006)

  2. Bach, F.R., Mairal, J., Ponce, J.: Convex sparse matrix factorizations (2008). Preprint available at http://arxiv.org/abs/0812.1869

  3. Balzano, L., Nowak, R., Recht, B.: Online identification and tracking of subspaces from highly incomplete information. In: Proceedings of the 48th Annual Allerton Conference (2010)

  4. Bertsekas, D.P.: A hybrid incremental gradient method for least squares. SIAM J. Optim. 7, 913–925 (1997)

  5. Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999)

  6. Bottou, L., Bousquet, O.: The tradeoffs of large scale learning. In: Advances in Neural Information Processing Systems (2008)

  7. Burer, S., Monteiro, R.D.C.: Local minima and convergence in low-rank semidefinite programming. Math. Program. 103(3), 427–444 (2005)

  8. Cai, J.F., Candès, E.J., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20(4), 1956–1982 (2010)

  9. Candès, E.J., Recht, B.: Exact matrix completion via convex optimization. Found. Comput. Math. 9(6), 717–772 (2009)

  10. Candès, E.J., Tao, T.: The power of convex relaxation: near-optimal matrix completion. IEEE Trans. Inf. Theory 56(5), 2053–2080 (2010)

  11. Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4(4), 1168–1200 (2005)

  12. Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. Ser. A 91, 201–213 (2002)

  13. Drineas, P., Frieze, A., Kannan, R., Vempala, S., Vinay, V.: Clustering in large graphs and matrices. In: Proceedings of the 10th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (1999)

  14. Funk, S.: Netflix update: try this at home (2006). http://sifter.org/simon/journal/20061211.html

  15. Gross, D.: Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inf. Theory 57, 1548–1566 (2011)

  16. Jameson, G.J.O.: Summing and Nuclear Norms in Banach Space Theory. No. 8 in London Mathematical Society Student Texts. Cambridge University Press, Cambridge (1987)

  17. Ji, S., Ye, J.: An accelerated gradient method for trace norm minimization. In: Proceedings of the International Conference on Machine Learning (ICML) (2009)

  18. Keshavan, R.H., Montanari, A., Oh, S.: Matrix completion from a few entries. IEEE Trans. Inf. Theory 56(6), 2980–2998 (2010)

  19. Knuth, D.E.: The Art of Computer Programming, 2nd edn. Addison-Wesley Professional, Boston (1998)

  20. Kolda, T.G., Bader, B.W.: Tensor decompositions and applications. SIAM Rev. 51(3), 455–500 (2009)

  21. Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)

  22. Lee, J., Recht, B., Srebro, N., Salakhutdinov, R.R., Tropp, J.A.: Practical large-scale optimization for max-norm regularization. In: Advances in Neural Information Processing Systems (2010)

  23. Liu, Z., Vandenberghe, L.: Interior-point method for nuclear norm approximation with application to system identification. SIAM J. Matrix Anal. Appl. 31(3), 1235–1256 (2009)

  24. Luo, Z.Q.: On the convergence of the LMS algorithm with adaptive learning rate for linear feedforward networks. Neural Comput. 3(2), 226–245 (1991)

  25. Luo, Z.Q., Tseng, P.: Analysis of an approximate gradient projection method with applications to the backpropagation algorithm. Optim. Methods Softw. 4, 85–101 (1994)

  26. Ma, S., Goldfarb, D., Chen, L.: Fixed point and Bregman iterative methods for matrix rank minimization. Math. Program. (2009). Published online first at http://dx.doi.org/10.1007/s10107-009-0306-5

  27. Mangasarian, O.L., Solodov, M.V.: Serial and parallel backpropagation convergence via nonmonotone perturbed minimization. Optim. Methods Softw. 4, 103–116 (1994)

  28. Nedić, A., Bertsekas, D.P.: Convergence rate of incremental subgradient algorithms. In: Uryasev, S., Pardalos, P.M. (eds.) Stochastic Optimization: Algorithms and Applications, pp. 263–304. Kluwer Academic Publishers (2000)

  29. Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)

  30. Nesterov, Y.: Gradient methods for minimizing composite functions. Tech. rep., CORE Discussion Paper (2007). Preprint available at http://www.optimization-online.org/DB_HTML/2007/09/1784.html

  31. Recht, B.: A simpler approach to matrix completion. J. Mach. Learn. Res. 12, 3413–3430 (2011)

  32. Recht, B., Fazel, M., Parrilo, P.: Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010)

  33. Rennie, J.D.M., Srebro, N.: Fast maximum margin matrix factorization for collaborative prediction. In: Proceedings of the International Conference on Machine Learning (2005)

  34. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 877–898 (1976)

  35. Salakhutdinov, R., Mnih, A.: Probabilistic matrix factorization. In: Advances in Neural Information Processing Systems (2008)

  36. Srebro, N., Rennie, J., Jaakkola, T.: Maximum margin matrix factorization. In: Advances in Neural Information Processing Systems (2004)

  37. Srebro, N., Shraibman, A.: Rank, trace-norm and max-norm. In: 18th Annual Conference on Learning Theory (COLT) (2005)

  38. Tanay, A., Sharan, R., Shamir, R.: Discovering statistically significant biclusters in gene expression data. Bioinformatics 18(1), 136–144 (2002)

  39. Toh, K.C., Yun, S.: An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems. Pac. J. Optim. 6(3), 615–640 (2010)

  40. Tseng, P.: An incremental gradient(-projection) method with momentum term and adaptive stepsize rule. SIAM J. Optim. 8(2), 506–531 (1998)

  41. Wen, Z., Yin, W., Zhang, Y.: Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm. Tech. rep. TR10-07, CAAM, Rice University (2010)


Acknowledgments

This work was supported in part by ONR Contract N00014-11-M-0478. BR is additionally supported by ONR award N00014-11-1-0723 and NSF award CCF-1139953. CR is additionally supported by the Air Force Research Laboratory (AFRL) under prime contract no. FA8750-09-C-0181, the NSF CAREER award under IIS-1054009, ONR award N000141210041, and gifts or research awards from Google, Greenplum, Johnson Controls, Inc., LogicBlox, and Oracle. Any opinions, findings, and conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect the views of any of the above sponsors including DARPA, AFRL, or the US government.

Author information

Correspondence to Benjamin Recht.

Appendix

See Tables 3, 4, 5, 6, 7, 8, 9, 10 and 11.

Table 3 \(1{,}000 \times 1{,}000\)
Table 4 \(1{,}000 \times 5{,}000\)
Table 5 \(1{,}000 \times 10{,}000\)
Table 6 \(10{,}000 \times 10{,}000\)
Table 7 \(10{,}000 \times 50{,}000\)
Table 8 \(10{,}000 \times 100{,}000\)
Table 9 \(100{,}000 \times 100{,}000\)
Table 10 \(100{,}000 \times 500{,}000\)
Table 11 \(100{,}000 \times 1{,}000{,}000\)

Cite this article

Recht, B., Ré, C. Parallel stochastic gradient algorithms for large-scale matrix completion. Math. Prog. Comp. 5, 201–226 (2013). https://doi.org/10.1007/s12532-013-0053-8
