In big data scenarios, matrix factorization (MF) is widely used in recommendation systems because it offers high accuracy and scalability. However, applying MF to large-scale implicit feedback data raises two problems. First, negative feedback information is difficult to obtain effectively, which leads to relatively poor recommendation accuracy. Second, the limited resources of a single machine make model training inefficient, and acquiring negative feedback information further increases the time complexity of training. To solve these two problems, we first propose a user-activity and item-popularity weighted matrix factorization (UIWMF) recommendation algorithm, which assigns each missing entry a different weight based on user activity and item popularity, models negative feedback more realistically, and thereby improves recommendation accuracy. To reduce the additional computational overhead caused by the weighting strategy, we develop a fast optimization strategy that improves efficiency. To overcome the resource constraints of a single machine, we propose a distributed UIWMF (DUIWMF) algorithm based on Spark, which adopts an efficient parallel learning algorithm to train the model and uses cached in-block and out-block information to reduce communication overhead in a distributed environment. We conduct experiments on three public datasets, and the results demonstrate that, compared with the baseline MF methods, the DUIWMF model has comparable performance in terms of recommendation accuracy and model training efficiency.
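To make the weighting idea concrete, the following is a minimal sketch of how a weight for a missing (unobserved) user-item entry could be derived from user activity and item popularity. The abstract does not give the weighting formula, so everything here is an assumption for illustration: the function names, the parameters `c0` and `alpha`, the use of interaction counts as activity and popularity, and the product form of the weight are all hypothetical and not taken from the article.

```python
from collections import Counter


def weight_factors(interactions, alpha=0.5):
    """Compute per-user and per-item weight factors from observed pairs.

    Assumption (not from the article): activity and popularity are plain
    interaction counts, smoothed by the exponent alpha and normalized so
    that each set of factors sums to 1.
    """
    activity = Counter(u for u, _ in interactions)    # interactions per user
    popularity = Counter(i for _, i in interactions)  # interactions per item
    user_norm = sum(n ** alpha for n in activity.values())
    item_norm = sum(n ** alpha for n in popularity.values())
    user_factor = {u: n ** alpha / user_norm for u, n in activity.items()}
    item_factor = {i: n ** alpha / item_norm for i, n in popularity.items()}
    return user_factor, item_factor


def missing_weight(u, i, user_factor, item_factor, c0=1.0):
    """Hypothetical weight of the missing entry (u, i).

    The product form is an assumption chosen for illustration: an active
    user who skipped a popular item yields a stronger negative signal.
    """
    return c0 * user_factor[u] * item_factor[i]


# Toy implicit feedback: (user, item) pairs that were observed.
interactions = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"), ("u3", "i1")]
user_factor, item_factor = weight_factors(interactions)

# (u2, i2) and (u3, i2) are missing entries and receive a weight.
w_u2_i2 = missing_weight("u2", "i2", user_factor, item_factor)
```

A separable weight of this kind (a user term times an item term) is one common reason such schemes can be optimized quickly, since sums over all missing entries factor into per-user and per-item parts. Whether UIWMF uses this exact form is not stated in the abstract.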
The authors deeply appreciate the anonymous reviewers for their comments on the manuscript. The research was partially funded by the National Key R&D Program of China (Grant No. 2018YFB1003401), the National Natural Science Foundation of China (Grant Nos. 61872127, 61572175, 61751204, 61472124), the National Outstanding Youth Science Program of National Natural Science Foundation of China (Grant No. 61625202), the Key Program of National Natural Science Foundation of China (Grant No. 61432005), and the Natural Science Foundation of Hunan Province, China (Grant No. 2018JJ2022).
About this article
Cite this article
Chen, L., Yang, W., Li, K. et al. Distributed matrix factorization based on fast optimization for implicit feedback recommendation. J Intell Inf Syst (2020). https://doi.org/10.1007/s10844-020-00601-0
- Personalized recommendation
- Collaborative filtering
- User and item recommendation
- Fast optimization
- Distributed matrix factorization