Abstract
Multimodal features of items (e.g., text, audio, visual) have proven effective in improving recommendations. Most existing personalized recommendation algorithms rely only on user-item interactions, ignoring the rich auxiliary information implicit in the multimodal features of items. This auxiliary information can enrich the description of items and enhance the mining ability of the recommendation algorithm. However, because users have different preferences for each modality, multimodal recommendation is a challenging task. This paper therefore proposes a multimodal recommendation algorithm based on the Dempster-Shafer evidence theory. First, we use the similarity between a user's historically interacted items and non-interacted items in each modality to characterize the user's preference for the non-interacted items. Second, we use the degree of dispersion of each modality's features across the user's historically interacted items to represent the user's degree of preference for that modality. Third, following the Dempster-Shafer evidence theory, we treat the user's preferences for non-interacted items in the different modalities as evidence, and fuse this evidence with the modality preferences to produce recommendations. Experimental results show that the proposed method outperforms traditional personalized recommendation algorithms and early multimodal fusion recommendation methods. The proposed algorithm also has good interpretability and extensibility.
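The fusion step can be illustrated with a minimal sketch of Dempster's rule of combination. The abstract does not give the paper's exact formulas, so everything specific here is an assumption made for illustration: the two-element frame {like, dislike}, the hypothetical helper `modality_mass` that maps a similarity score in [0, 1] to masses, and the use of Shafer-style discounting with a per-modality weight in [0, 1] standing in for the modality preference.

```python
from itertools import product

# Hypothetical frame of discernment: the user either likes or dislikes the item.
LIKE, DISLIKE = frozenset({"like"}), frozenset({"dislike"})
FRAME = LIKE | DISLIKE  # mass assigned here means "this modality cannot tell"


def modality_mass(similarity, reliability):
    """Map one modality's similarity score to a discounted mass function.

    Assumption: both arguments lie in [0, 1]. The share (1 - reliability)
    of the belief is moved to the whole frame (Shafer discounting).
    """
    return {
        LIKE: reliability * similarity,
        DISLIKE: reliability * (1.0 - similarity),
        FRAME: 1.0 - reliability,
    }


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions."""
    joint, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence accumulates on the intersection.
            joint[inter] = joint.get(inter, 0.0) + wa * wb
        else:
            # Contradictory evidence is counted as conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    # Renormalize by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in joint.items()}


def fuse(masses):
    """Combine the evidence from all modalities, one modality at a time."""
    fused = masses[0]
    for m in masses[1:]:
        fused = combine(fused, m)
    return fused


# Illustrative numbers only: a strong, trusted visual signal and a weaker text signal.
visual = modality_mass(similarity=0.8, reliability=0.9)
text = modality_mass(similarity=0.6, reliability=0.5)
fused = fuse([visual, text])
print(fused[LIKE], fused[DISLIKE], fused[FRAME])
```

Under these assumptions, candidate items could be ranked by the fused mass on `LIKE`. A modality with a low weight pushes most of its mass onto the whole frame, so it has little influence on the combined result.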
Acknowledgements
This work was supported by the Science Fund for Outstanding Youth of Xinjiang Uygur Autonomous Region under Grant No. 2021D01E14, the National Science Foundation of China under Grant No. 61867006, the Major Science and Technology Project of Xinjiang Uygur Autonomous Region under Grant No. 2020A03001, and the Key Laboratory Open Project of the Science and Technology Department of Xinjiang Uygur Autonomous Region, "Research on video information intelligent processing technology for Xinjiang regional security".
About this article
Cite this article
Wang, X., Qin, J. Multimodal recommendation algorithm based on Dempster-Shafer evidence theory. Multimed Tools Appl 83, 28689–28704 (2024). https://doi.org/10.1007/s11042-023-15262-8
DOI: https://doi.org/10.1007/s11042-023-15262-8