Machine Learning, Volume 107, Issue 3, pp 579–603

1-Bit matrix completion: PAC-Bayesian analysis of a variational approximation

  • Vincent Cottet
  • Pierre Alquier


Abstract

We focus on the completion of a (possibly) low-rank matrix with binary entries, the so-called 1-bit matrix completion problem. Our approach relies on tools from machine learning theory: empirical risk minimization and its convex relaxations. We propose an algorithm to compute a variational approximation of the pseudo-posterior. Thanks to the convex relaxation, the corresponding minimization problem is bi-convex, and thus the method works well in practice. We study the performance of this variational approximation through PAC-Bayesian learning bounds. Contrary to previous works, which focused on upper bounds on the estimation error of the matrix M in various matrix norms, we are able to derive from this analysis a PAC bound on the prediction error of our algorithm. We focus essentially on convex relaxation through the hinge loss, for which we present a complete analysis, a simulation study, and a test on the MovieLens data set. We also discuss a variational approximation to deal with the logistic loss.
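To make the bi-convex structure concrete, here is a minimal sketch (not the paper's exact estimator) of hinge-loss 1-bit matrix completion: the matrix is parameterized as M = UVᵀ, the empirical hinge risk over observed entries is minimized by alternating subgradient steps, and a Frobenius penalty on U and V stands in for the paper's prior/regularization. All names and parameter values below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a rank-2 signal observed through its signs on a random mask.
m, n, k = 30, 20, 2
U_true = rng.normal(size=(m, k))
V_true = rng.normal(size=(n, k))
Y = np.sign(U_true @ V_true.T)       # binary entries in {-1, +1}
mask = rng.random((m, n)) < 0.5      # which entries are observed

def hinge_risk(U, V, Y, mask, lam):
    """Empirical hinge risk on observed entries + Frobenius penalty."""
    margins = Y * (U @ V.T)
    loss = np.where(mask, np.maximum(0.0, 1.0 - margins), 0.0).sum() / mask.sum()
    return loss + lam * (np.sum(U**2) + np.sum(V**2))

def alt_min(Y, mask, k=2, lam=1e-3, lr=0.05, iters=300):
    """Alternating subgradient descent: convex in U for fixed V, and vice versa."""
    U = rng.normal(scale=0.1, size=(Y.shape[0], k))
    V = rng.normal(scale=0.1, size=(Y.shape[1], k))
    for _ in range(iters):
        # Hinge subgradient w.r.t. M = U V^T: -Y where the margin is violated.
        margins = Y * (U @ V.T)
        G = np.where(mask & (margins < 1.0), -Y, 0.0) / mask.sum()
        U -= lr * (G @ V + 2 * lam * U)      # step in U, V held fixed
        margins = Y * (U @ V.T)
        G = np.where(mask & (margins < 1.0), -Y, 0.0) / mask.sum()
        V -= lr * (G.T @ U + 2 * lam * V)    # step in V, U held fixed
    return U, V

U, V = alt_min(Y, mask)
pred = np.sign(U @ V.T)
test_err = (pred != Y)[~mask].mean()  # misclassification on unobserved entries
```

Each sub-problem here is convex (the hinge loss is convex in M, hence in U when V is fixed), which is the bi-convexity the abstract refers to; the paper itself works with a variational approximation of a pseudo-posterior rather than this plain point estimate.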


Keywords: Matrix completion · PAC-Bayesian bounds · Variational Bayes · Supervised classification · Risk convexification · Oracle inequalities



Acknowledgements

We would like to thank Vincent Cottet's Ph.D. supervisor, Professor Nicolas Chopin, for his kind support during the project, and the three anonymous referees for their helpful and constructive comments.



Copyright information

© The Author(s) 2017

Authors and Affiliations

  1. CREST, ENSAE, Université Paris-Saclay, Paris, France
