Multimodal recommendation algorithm based on Dempster-Shafer evidence theory

Multimedia Tools and Applications

Abstract

Multimodal item features (e.g., text, audio, and visual content) have been shown to improve recommendations. Most existing personalized recommendation algorithms rely only on user-item interactions, ignoring the rich auxiliary information implicit in the multimodal features of items. This auxiliary information can enrich the description of items and enhance the mining ability of the recommendation algorithm. However, because users have different preferences for each modality, multimodal recommendation is a challenging task. This paper therefore proposes a multimodal recommendation algorithm based on Dempster-Shafer evidence theory. First, we use the similarity, measured in each modality, between the items a user has interacted with and the items the user has not interacted with to characterize the user's preference for non-interacted items. Second, we use the degree of dispersion of each modality's features across the user's historical interaction items to represent how strongly the user prefers that modality. Third, following Dempster-Shafer evidence theory, we treat the user's per-modality preferences for non-interacted items as evidence and fuse this evidence with the modality preferences to produce recommendations. Experimental results show that the proposed method outperforms the traditional personalized recommendation algorithms and the early multimodal fusion recommendation method. The proposed algorithm is also interpretable and extensible.
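The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the preview does not give the paper's formulas, so the two-hypothesis frame {like, dislike}, the mapping from similarity to mass (`mass_from_similarity`), the use of Shafer discounting to apply modality weights, and all numeric values below are assumptions chosen only to show how Dempster's rule of combination fuses per-modality evidence.

```python
from itertools import product

LIKE = frozenset({"like"})
DISLIKE = frozenset({"dislike"})
THETA = LIKE | DISLIKE  # frame of discernment (assumed two-hypothesis frame)


def discount(mass, reliability):
    """Shafer discounting: shift (1 - reliability) of the total mass to THETA."""
    out = {f: reliability * v for f, v in mass.items() if f != THETA}
    out[THETA] = 1.0 - reliability + reliability * mass.get(THETA, 0.0)
    return out


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions."""
    fused = {}
    conflict = 0.0
    for (a, va), (b, vb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + va * vb
        else:
            conflict += va * vb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    return {f: v / (1.0 - conflict) for f, v in fused.items()}


def mass_from_similarity(similarity):
    """Hypothetical mapping: a similarity in [0, 1] becomes belief in 'like'."""
    return {LIKE: similarity, DISLIKE: 1.0 - similarity}


# Hypothetical per-modality similarity between one candidate item and the
# user's history, and hypothetical per-modality preference weights.
similarities = {"visual": 0.8, "audio": 0.6, "text": 0.3}
weights = {"visual": 0.9, "audio": 0.5, "text": 0.7}

fused = {THETA: 1.0}  # vacuous mass: no evidence yet
for modality, sim in similarities.items():
    evidence = discount(mass_from_similarity(sim), weights[modality])
    fused = combine(fused, evidence)

score = fused.get(LIKE, 0.0)  # fused belief that the user likes the item
print(f"fused belief in 'like': {score:.3f}")
```

In this sketch, a modality with a low weight contributes mostly to `THETA` (ignorance) after discounting, so it moves the fused belief less than a modality the user relies on heavily. Ranking candidate items by `score` would then yield a recommendation list under these assumptions.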

Notes

  1. https://grouplens.org/datasets/movielens/.

  2. http://ffmpeg.org/.

References

  1. Ambati LS, El-Gayar O (2021) Human activity recognition: a comparison of machine learning approaches. Journal of the Midwest Association for Information Systems (JMWAIS) 2021(1):49. https://doi.org/10.17705/3jmwa.000065

  2. Arora S, Liang Y, Ma T (2017) A simple but tough-to-beat baseline for sentence embeddings. In: International conference on learning representations. https://openreview.net/forum?id=SyK00v5xx

  3. Atrey PK, Hossain MA, El Saddik A, Kankanhalli MS (2010) Multimodal fusion for multimedia analysis: a survey. Multimed Syst 16(6):345–379. https://doi.org/10.1007/s00530-010-0182-0

  4. Baltrusaitis T, Ahuja C, Morency L (2019) Multimodal machine learning: a survey and taxonomy. IEEE Trans Pattern Anal Mach Intell 41(2):423–443. https://doi.org/10.1109/TPAMI.2018.2798607

  5. Bobadilla J, Ortega F, Hernando A, Gutiérrez A (2013) Recommender systems survey. Knowl-Based Syst 46:109–132. https://doi.org/10.1016/j.knosys.2013.03.012

  6. Chen J, Zhang H, He X, Nie L, Liu W, Chua TS (2017) Attentive collaborative filtering: multimedia recommendation with item-and component-level attention. In: Proceedings of the 40th International ACM SIGIR conference on Research and Development in Information Retrieval, pp 335–344. https://doi.org/10.1145/3077136.3080797

  7. Cuzzocrea A, Fadda E (2020) Data-intensive object-oriented adaptive web systems: implementing and experimenting the oo-xahm framework. In: Proceedings of the 12th International conference on management of digital EcoSystems, pp 115–123. https://doi.org/10.1145/3415958.3433051

  8. D’mello SK, Kory J (2015) A review and meta-analysis of multimodal affect detection systems. ACM Computing Surveys (CSUR) 47(3):1–36. https://doi.org/10.1145/2682899

  9. Dempster AP (2008) Upper and lower probabilities induced by a multivalued mapping. In: Classic works of the Dempster-Shafer theory of belief functions, Springer, pp 57–72. https://doi.org/10.1007/978-3-540-44792-4_3

  10. Denoeux T (2019) Decision-making with belief functions: a review. Int J Approx Reason 109:87–110. https://doi.org/10.1016/j.ijar.2019.03.009

  11. Denoeux T (2019) Logistic regression, neural networks and dempster–shafer theory: a new perspective. Knowl-Based Syst 176:54–67. https://doi.org/10.1016/j.knosys.2019.03.030

  12. Deshpande M, Karypis G (2004) Item-based top-n recommendation algorithms. ACM Transactions on Information Systems (TOIS) 22(1):143–177. https://doi.org/10.1145/963770.963776

  13. El-Gayar OF, Ambati LS, Nawar N (2020) Wearables, artificial intelligence, and the future of healthcare. In: AI and big data’s potential for disruptive innovation, IGI Global, pp 104–129. https://doi.org/10.4018/978-1-5225-9687-5.ch005

  14. He X, Liao L, Zhang H, Nie L, Hu X, Chua TS (2017) Neural collaborative filtering. In: Proceedings of the 26th international conference on world wide web, pp 173–182. https://doi.org/10.1145/3038912.3052569

  15. He R, McAuley J (2016) Vbpr: visual bayesian personalized ranking from implicit feedback. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 30. https://ojs.aaai.org/index.php/AAAI/article/view/9973

  16. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778. https://doi.org/10.1109/CVPR.2016.90

  17. Hershey S, Chaudhuri S, Ellis DP, Gemmeke JF, Jansen A, Moore RC, Plakal M, Platt D, Saurous RA, Seybold B et al (2017) Cnn architectures for large-scale audio classification. In: 2017 IEEE international conference on acoustics, speech and signal processing (icassp), IEEE, pp 131–135. https://doi.org/10.1109/ICASSP.2017.7952132

  18. Hu Y, Koren Y, Volinsky C (2008) Collaborative filtering for implicit feedback datasets. In: 2008 8th IEEE International Conference on Data Mining, IEEE, pp 263–272. https://doi.org/10.1109/ICDM.2008.22

  19. Huang Y, Du C, Xue Z, Chen X, Zhao H, Huang L (2021) What makes multimodal learning better than single (provably). arXiv:2106.04538

  20. Jin J, Xiao R, Daly I, Miao Y, Wang X, Cichocki A (2020) Internal feature selection method of csp based on l1-norm and dempster-shafer theory. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/TNNLS.2020.3015505

  21. Johnson CC (2014) Logistic matrix factorization for implicit feedback data. Adv Neural Inf Process Syst 27(78):1–9

  22. Li W, Zhou X, Shimizu S, Xin M, Jiang J, Gao H, Jin Q (2019) Personalization recommendation algorithm based on trust correlation degree and matrix factorization. IEEE Access 7:45451–45459. https://doi.org/10.1109/ACCESS.2018.2885084

  23. Lika B, Kolomvatsos K, Hadjiefthymiades S (2014) Facing the cold start problem in recommender systems. Expert Syst Appl 41(4):2065–2073. https://doi.org/10.1016/j.eswa.2013.09.005

  24. Morency LP, Mihalcea R, Doshi P (2011) Towards multimodal sentiment analysis: harvesting opinions from the web. In: Proceedings of the 13th international conference on multimodal interfaces, pp 169–176. https://doi.org/10.1145/2070481.2070509

  25. Pan R, Zhou Y, Cao B, Liu NN, Lukose R, Scholz M, Yang Q (2008) One-class collaborative filtering. In: 2008 Eighth IEEE International conference on data mining, IEEE, pp 502–511. https://doi.org/10.1109/ICDM.2008.16

  26. Pazzani MJ, Billsus D (2007) Content-based recommendation systems. In: The adaptive web, Springer, pp 325–341. https://doi.org/10.1007/978-3-540-72079-9_10

  27. Poria S, Chaturvedi I, Cambria E, Hussain A (2016) Convolutional mkl based multimodal emotion recognition and sentiment analysis. In: 2016 IEEE 16th international conference on data mining (ICDM), IEEE, pp 439–448. https://doi.org/10.1109/ICDM.2016.0055

  28. Rendle S, Freudenthaler C, Gantner Z, Schmidt-Thieme L (2012) Bpr: Bayesian personalized ranking from implicit feedback. arXiv:1205.2618

  29. Sai Ambati L, El-Gayar OF, Nawar N (2020) Influence of the digital divide and socio-economic factors on prevalence of diabetes. https://doi.org/10.48009/4_iis_2020_103-113

  30. Sarwar B, Karypis G, Konstan J, Riedl J (2001) Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th international conference on World Wide Web, pp 285–295. https://doi.org/10.1145/371920.372071

  31. Shafer G (1976) A mathematical theory of evidence. Princeton university press

  32. Singh R, Vatsa M, Noore A, Singh SK (2006) Ds theory based fingerprint classifier fusion with update rule to minimize training time. IEICE Electron Expr 3(20):429–435. https://doi.org/10.1587/elex.3.429

  33. Smets P (2005) Decision making in the tbm: the necessity of the pignistic transformation. Int J Approx Reason 38(2):133–147. https://doi.org/10.1016/j.ijar.2004.05.003

  34. Su Zg, Denoeux T (2018) Bpec: belief-peaks evidential clustering. IEEE Trans Fuzzy Syst 27(1):111–123. https://doi.org/10.1109/TFUZZ.2018.2869125

  35. Tao Y, Wang C, Yao L, Li W, Yu Y (2021) Item trend learning for sequential recommendation system using gated graph neural network. Neural Comput Applic, pp 1–16. https://doi.org/10.1007/s00521-021-05723-2

  36. Tao Z, Wei Y, Wang X, He X, Huang X, Chua TS (2020) Mgat: multimodal graph attention network for recommendation. Inf Process Manag 57(5):102277. https://doi.org/10.1016/j.ipm.2020.102277

  37. Tarus JK, Niu Z, Mustafa G (2018) Knowledge-based recommendation: a review of ontology-based recommender systems for e-learning. Artif Intell Rev 50(1):21–48. https://doi.org/10.1007/s10462-017-9539-5

  38. Wei Y, Wang X, Nie L, He X, Hong R, Chua TS (2019) Mmgcn: multi-modal graph convolution network for personalized recommendation of micro-video. In: Proceedings of the 27th ACM International conference on multimedia, pp 1437–1445. https://doi.org/10.1145/3343031.3351034

  39. Wu C, Wu F, Qi T, Huang Y (2021) Mm-rec: multimodal news recommendation. arXiv:2104.07407

  40. Yu X, Jiang F, Du J, Gong D (2019) A cross-domain collaborative filtering algorithm with expanding user and item features via the latent factor space of auxiliary domains. Pattern Recogn 94:96–109. https://doi.org/10.1016/j.patcog.2019.05.030

  41. Zeng Z, Pantic M, Roisman GI, Huang TS (2008) A survey of affect recognition methods: audio, visual, and spontaneous expressions. IEEE Trans Pattern Anal Mach Intell 31(1):39–58. https://doi.org/10.1109/TPAMI.2008.52

  42. Zheng V, Cao B, Zheng Y, Xie X, Yang Q (2010) Collaborative filtering meets mobile recommendation: a user-centered approach. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 24. http://www.aaai.org/ocs/index.php/AAAI/AAAI10/paper/view/1615

  43. Zhou T, Thung KH, Liu M, Shi F, Zhang C, Shen D (2020) Multi-modal latent space inducing ensemble svm classifier for early dementia diagnosis with neuroimaging data. Med Image Anal 60:101630. https://doi.org/10.1016/j.media.2019.101630

Acknowledgements

This work was supported by the Science Fund for Outstanding Youth of Xinjiang Uygur Autonomous Region under Grant No. 2021D01E14, the National Science Foundation of China under Grant No. 61867006, the Major Science and Technology Project of Xinjiang Uygur Autonomous Region under Grant No. 2020A03001, and the Key Laboratory Open Project of the Science and Technology Department of Xinjiang Uygur Autonomous Region, "Research on video information intelligent processing technology for Xinjiang regional security".

Author information

Corresponding author

Correspondence to Jiwei Qin.

Ethics declarations

Conflict of Interests

The data used in the current study come from the publicly available MMGCN dataset [38].

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, X., Qin, J. Multimodal recommendation algorithm based on Dempster-Shafer evidence theory. Multimed Tools Appl 83, 28689–28704 (2024). https://doi.org/10.1007/s11042-023-15262-8

  • DOI: https://doi.org/10.1007/s11042-023-15262-8
