Automatic Regularization of Factorization Models

  • Steffen Rendle
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


Many recent machine learning approaches to prediction problems over categorical variables are based on factorization models, e.g. matrix or tensor factorization. Because of their large number of model parameters, factorization models are prone to overfitting, and Gaussian priors are typically applied for regularization. Proper values for the regularization parameters are usually found with an expensive grid search on holdout validation data. This work presents two approaches that find regularization values without increasing computational complexity. The first interweaves the optimization of model parameters and regularization values in stochastic gradient descent algorithms. The second, discussed briefly, is a two-level Bayesian model that integrates the regularization values into inference.
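The first idea, alternating model updates and regularization updates inside stochastic gradient descent, can be illustrated with a small sketch. This is not the paper's algorithm: it assumes plain matrix factorization with squared loss, a single shared regularization value `lam`, and a simplified one-step look-ahead in which the validation loss is differentiated only through the shrink factor `(1 - lr * lam)`. All names (`fit`, `lookahead_prediction`, `dlookahead_dlam`, `lr_lam`) are hypothetical and chosen for illustration.

```python
import random


def dot(p, q):
    return sum(a * b for a, b in zip(p, q))


def lookahead_prediction(p, q, lr, lam):
    """Prediction after a hypothetical shrink step theta <- (1 - lr*lam) * theta."""
    s = 1.0 - lr * lam
    return s * s * dot(p, q)


def dlookahead_dlam(p, q, lr, lam):
    """Derivative of lookahead_prediction with respect to lam."""
    return -2.0 * lr * (1.0 - lr * lam) * dot(p, q)


def fit(train, valid, n_users, n_items, k=4, lr=0.05, lr_lam=0.01,
        lam=0.1, epochs=20, seed=0):
    """Alternate SGD steps on the factors (training data) and on lam (validation data)."""
    rng = random.Random(seed)
    P = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        order = list(train)
        rng.shuffle(order)
        for u, i, y in order:
            # (1) Model step on one training case, with lam held fixed.
            p_old, q_old = P[u][:], Q[i][:]
            err = dot(p_old, q_old) - y
            P[u] = [p - lr * (err * q + lam * p) for p, q in zip(p_old, q_old)]
            Q[i] = [q - lr * (err * p + lam * q) for p, q in zip(p_old, q_old)]
            # (2) Regularization step on one validation case, with P and Q held fixed.
            vu, vi, vy = rng.choice(valid)
            verr = lookahead_prediction(P[vu], Q[vi], lr, lam) - vy
            grad = verr * dlookahead_dlam(P[vu], Q[vi], lr, lam)
            lam = max(0.0, lam - lr_lam * grad)  # keep lam non-negative
    return P, Q, lam
```

Calling `fit(train, valid, n_users, n_items)` with lists of `(user, item, target)` triples returns the two factor matrices and the final `lam`. Each training case triggers one constant-cost step for `lam`, so the per-epoch cost stays within a constant factor of ordinary SGD and no outer grid search over `lam` is run.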



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. University of Konstanz, Konstanz, Germany
