Speeding up ALS learning via approximate methods for context-aware recommendations


Implicit feedback-based recommendation problems, which are typical of real-world applications, have recently been receiving more attention in the research community. From a practical point of view, the scalability of such methods is crucial. However, factorization-based algorithms that are efficient on explicit rating data become computationally inefficient when applied directly to implicit data; therefore, different techniques are needed to adapt them to implicit feedback. For alternating least squares (ALS) learning, several research contributions have proposed efficient adaptation techniques for implicit feedback. These algorithms scale linearly with the number of nonzero data points, but cubically with the number of features, which is a computational bottleneck that prevents the efficient use of accurate models with many factors. In addition, map-reduce type big data techniques are not viable with ALS learning, because there is no known technique that solves the high communication overhead required for random access to the feature matrices. To overcome this drawback, we present two generic approximate variants for fast ALS learning, using conjugate gradient (CG) and coordinate descent (CD). Both CG and CD can be coupled with any method that uses ALS learning. We demonstrate the advantages of the fast ALS variants on iTALS, a generic context-aware algorithm that applies ALS learning for tensor factorization on implicit data. In the experiments, we compare the approximate techniques with base ALS learning in terms of training time, scalability, recommendation accuracy, and convergence. We show that the proposed solutions offer a trade-off between recommendation accuracy and training speed; this makes it possible to apply ALS-based methods efficiently even for billions of data points.
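To make the cubic bottleneck and the two approximate alternatives concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each ALS update reduces to a regularized linear system \(Ax=b\) with a symmetric positive definite \(K \times K\) matrix \(A\). The function names, the random data, and the regularization value `lam` are hypothetical, and the coordinate descent variant here works on the normal equations for simplicity, which may differ from the formulation used in the paper.

```python
import numpy as np


def solve_exact(A, b):
    """Exact solve of A x = b, as in base ALS; O(K^3) for a K x K system."""
    return np.linalg.solve(A, b)


def solve_cg(A, b, x0, n_steps):
    """Approximate solve with a fixed number of conjugate gradient steps.

    A must be symmetric positive definite. Each step needs one
    matrix-vector product, i.e. O(K^2) for a dense K x K matrix.
    """
    x = x0.copy()
    r = b - A @ x           # residual
    p = r.copy()            # search direction
    rs_old = r @ r
    for _ in range(n_steps):
        if rs_old < 1e-20:  # residual is already negligible
            break
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


def solve_cd(A, b, x0, n_sweeps):
    """Approximate solve with cyclic coordinate descent.

    Each inner step minimizes 0.5 x^T A x - b^T x in a single
    coordinate while the other coordinates stay fixed.
    """
    x = x0.copy()
    for _ in range(n_sweeps):
        for k in range(len(b)):
            x[k] += (b[k] - A[k] @ x) / A[k, k]
    return x


# Hypothetical data: a fixed feature matrix M and targets y.
rng = np.random.default_rng(0)
K = 20
M = rng.normal(size=(200, K))
y = rng.normal(size=200)
lam = 0.1                        # assumed regularization strength
A = M.T @ M + lam * np.eye(K)    # symmetric positive definite
b = M.T @ y

x_exact = solve_exact(A, b)
x_cg = solve_cg(A, b, np.zeros(K), n_steps=K)
x_cd = solve_cd(A, b, np.zeros(K), n_sweeps=50)
x_cg_few = solve_cg(A, b, np.zeros(K), n_steps=3)
```

Running CG for fewer steps than \(K\), as in `x_cg_few`, gives a cheaper but less precise solution, which is the kind of accuracy versus training time trade-off described above.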



  1. For example, a user purchased an item or viewed a product page. Interactions are also called events or transactions.

  2. It is beneficial if the data are also stored in shared memory, but they can be stored on disk instead, if properly indexed.

  3. Here we assumed a relatively high density of \({\sim }1\,\%\), with 100 K users and 45 K items, which is realistic for \({\sim }45\) M records (100 K \(\times \) 45 K \(\times \) 1 % = 45 M).

  4. With a proper weighting scheme, iTALS could be used with explicit feedback as well.

  5. \(DN^+=\sum _{i=1}^{D}{S_i}\) means that we only have one event/example for each user, for each item, and for each context state. In this case, CF methods are not applicable due to sparseness.

  6. The complexity of Algorithm 4.1 is \(O(N_EN_IK)\), which is \(O\left( (K^2+N^+_jK)N_I\right) \) in our case for one feature vector.

  7. Data were collected by the service providers of an online grocery store and a VoD store, respectively, by monitoring the purchases in the system. There were no recommender systems active during the data collection period.

  8. This value is 1.0 at TV1 and TV2. This is possibly due to preprocessing by the original authors that removed duplicate events.

  9. The actual speedup and improvement in scalability depend on the efficiency of certain key steps (e.g., matrix-vector multiplication for CG). These may differ from algorithm to algorithm.

  10. With fixed list length and test set, these values are proportional to the recall@20 value.

  11. In the following sense: \(N_I\) values relative to the number of features. That is, if K is lower/higher, then approximate methods reach the training time of ALS at lower/higher \(N_I\) values.




The work leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007-2013) under CrowdRec Grant Agreement No. 610594. The authors would like to thank Martha Larson for her useful comments on the paper.

Author information

Corresponding author

Correspondence to Balázs Hidasi.

About this article

Cite this article

Hidasi, B., Tikk, D. Speeding up ALS learning via approximate methods for context-aware recommendations. Knowl Inf Syst 47, 131–155 (2016).

  • Recommender systems
  • Tensor factorization
  • Context awareness
  • Implicit feedback
  • Scalability
  • Comparison