
Analyzing and improving stability of matrix factorization for recommender systems

Journal of Intelligent Information Systems


Thanks to their flexibility and scalability, collaborative embedding-based models are widely employed for the top-N recommendation task. Their goal is to jointly represent users and items in a common low-dimensional embedding space, where users lie close to the items for which they expressed a positive preference. The training procedure of these techniques is influenced by several sources of randomness that can have a strong impact on the embeddings learned by the models. In this paper we analyze this impact on Matrix Factorization (MF). In particular, we focus on the effects of training the same model on the same data, but with different initial values for the latent representations of users and items. We perform several experiments employing three well-known MF implementations over five datasets. We show that different random initializations lead the same MF technique to generate very different latent representations and recommendation lists. We refer to these inconsistencies as instability of representations and instability of recommendations, respectively. We report that the stability of item representations is positively correlated with the accuracy of the model. We show that the stability issues also affect the items for which the recommender correctly predicts positive preferences. Moreover, we highlight that the effect is stronger for less popular items. To overcome these drawbacks, we present a generalization of MF called Nearest Neighbors Matrix Factorization (NNMF). The new framework learns the embedding of each user and item as a weighted linear combination of the representations of the respective nearest neighbors. This strategy propagates the information about items and users to their neighbors as well, and allows the embeddings of users and items with few interactions to be supported by a larger amount of information.
To empirically demonstrate the advantages of the new framework, we provide a detailed description of the NNMF variants of three common MF techniques. We show that NNMF models, compared to their MF counterparts, largely improve the stability of both representations and recommendations, achieve higher and more stable accuracy, especially on long-tail items, and converge in a fraction of the epochs.
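The core idea of NNMF, an embedding expressed as a weighted linear combination of the representations of the nearest neighbors, can be illustrated with a short sketch. This is not the authors' implementation: the choice of cosine similarity, the row normalization of the weights, the toy interaction matrix and all names (`knn_weights`, `nn_embeddings`, `urm`) are assumptions made for illustration only, and the training loop that would learn the base representations is omitted.

```python
import numpy as np


def knn_weights(interactions: np.ndarray, k: int) -> np.ndarray:
    """Row-normalized top-k cosine similarities between the rows of `interactions`.

    Assumption: cosine similarity and row normalization are illustrative
    choices, not necessarily the ones used in the paper.
    """
    norms = np.linalg.norm(interactions, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                      # avoid division by zero
    unit = interactions / norms
    sim = unit @ unit.T                          # cosine similarity matrix

    # Keep only the k most similar rows for each row (the row itself included).
    top = np.argsort(-sim, axis=1)[:, :k]
    rows = np.arange(sim.shape[0])[:, None]
    weights = np.zeros_like(sim)
    weights[rows, top] = sim[rows, top]

    row_sums = weights.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return weights / row_sums


def nn_embeddings(weights: np.ndarray, base: np.ndarray) -> np.ndarray:
    """Each embedding is a weighted linear combination of its neighbors' base vectors."""
    return weights @ base


# Toy user-item interaction matrix (4 users, 3 items); hypothetical data.
urm = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 1],
], dtype=float)

rng = np.random.default_rng(0)
n_factors, k = 2, 2
user_base = rng.normal(size=(urm.shape[0], n_factors))  # would be learned in practice
item_base = rng.normal(size=(urm.shape[1], n_factors))  # would be learned in practice

user_w = knn_weights(urm, k)       # user neighborhoods from rows of the matrix
item_w = knn_weights(urm.T, k)     # item neighborhoods from columns of the matrix

user_emb = nn_embeddings(user_w, user_base)
item_emb = nn_embeddings(item_w, item_base)
scores = user_emb @ item_emb.T     # predicted preference scores, as in plain MF
```

In this sketch, a user or item with few interactions still receives an embedding built from several base vectors, which is the mechanism the abstract credits for supporting sparse users and items with more information.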


Code Availability

The source code used to perform all the experiments, including data splitting, hyper-parameter optimization, stability and accuracy performance evaluation, is publicly available. The link is the following:


  1. The definition of repeatability is reported in the ACM Artifact Review and Badging guidelines:


  3. Note that instances that use the same random seed for the initialization would converge to the exact same results.


  5. The long-tail is the set of least popular items that account for 66% of the interactions. The short-head is defined as the complement of the long-tail: the set of most popular items that account for the remaining 34% of the interactions.

  6. In our experiments, the number k of nearest neighbors was treated as a hyperparameter of the model, and optimized on the validation set.
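The long-tail and short-head definition given in the notes above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_long_tail`, the toy popularity counts and the handling of the item that straddles the 66% boundary (it is assigned to the short-head here) are assumptions.

```python
import numpy as np


def split_long_tail(item_popularity, tail_share=0.66):
    """Split item indices into (long_tail, short_head).

    Items are sorted from least to most popular; the long-tail is the
    largest set of least popular items whose interactions account for at
    most `tail_share` of the total. Assumption: an item crossing the
    threshold falls into the short-head.
    """
    pop = np.asarray(item_popularity, dtype=float)
    order = np.argsort(pop, kind="stable")           # least popular first
    cum_share = np.cumsum(pop[order]) / pop.sum()    # cumulative interaction share
    in_tail = cum_share <= tail_share + 1e-12        # small tolerance for rounding
    return order[in_tail], order[~in_tail]


# Hypothetical interaction counts for five items (100 interactions in total).
popularity = [34, 30, 20, 10, 6]
long_tail, short_head = split_long_tail(popularity)
```

With these toy counts, the four least popular items together hold 66 of the 100 interactions and form the long-tail, while the single most popular item holds the remaining 34 and forms the short-head.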




Edoardo D’Amico and Giovanni Gabbolini would like to acknowledge a grant from Science Foundation Ireland (SFI) under Grant Number 12/RC/2289-P2, which is co-funded under the European Regional Development Fund, that partially supported this research.



Author information



Corresponding author

Correspondence to Edoardo D’Amico.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Availability of Data and Material

The datasets used in the experiments are publicly available in the online repository.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Edoardo D’Amico and Giovanni Gabbolini contributed equally to the work.


About this article


Cite this article

D’Amico, E., Gabbolini, G., Bernardis, C. et al. Analyzing and improving stability of matrix factorization for recommender systems. J Intell Inf Syst 58, 255–285 (2022).
