Abstract
As a growing number of users express their emotions through images on social media, image emotion recognition has attracted considerable research attention. Unlike conventional computer vision tasks, image emotion recognition is inherently more challenging because emotion is ambiguous and subjective. Existing methods are limited to learning a direct mapping from image features to emotions. However, the mechanism of emotion cognition in psychology indicates that human beings perceive emotion in a stepwise way. We therefore propose a novel image emotion recognition method that uses emotional concepts as an intermediary to bridge image and emotion. Specifically, we organize the relationships between concepts and emotions in the form of a knowledge graph. The relation between image and emotion is then explored in the semantic embedding space into which this knowledge is encoded. Next, based on the hierarchical relations among emotions, we propose a multi-task deep learning model that recognizes image emotion from a visual perspective. Finally, a fusion strategy merges the results of the visual-semantic stream and the visual stream. Extensive experiments show that our method outperforms state-of-the-art methods on two public image emotion datasets.
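The final fusion step can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, so the weighted average of per-class probabilities used here is an assumption chosen for illustration, and the names `fuse_streams`, `semantic_scores`, `visual_logits`, and `alpha` are hypothetical rather than taken from the paper.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over the last axis."""
    shifted = scores - np.max(scores, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def fuse_streams(semantic_scores, visual_logits, alpha=0.5):
    """Late fusion of two streams (assumed rule, not the paper's).

    semantic_scores: per-emotion scores from the visual-semantic stream,
                     shape (batch, num_emotions).
    visual_logits:   per-emotion logits from the visual stream, same shape.
    alpha:           hypothetical weight on the visual-semantic stream.
    """
    p_semantic = softmax(np.asarray(semantic_scores, dtype=float))
    p_visual = softmax(np.asarray(visual_logits, dtype=float))
    # Convex combination of the two probability distributions.
    fused = alpha * p_semantic + (1.0 - alpha) * p_visual
    return fused, np.argmax(fused, axis=-1)


if __name__ == "__main__":
    # Toy scores for one image over three emotion classes.
    fused, predicted = fuse_streams([[2.0, 0.5, 0.1]], [[0.2, 1.5, 0.3]], alpha=0.5)
    print(fused, predicted)
```

Because each stream is normalized before mixing, the fused vector remains a probability distribution, and `alpha` could in principle be tuned on a validation set.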
Funding
This work was supported in part by the National Natural Science Foundation of China under Grant 62071384.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Yang, H., Fan, Y., Lv, G. et al. Exploiting emotional concepts for image emotion recognition. Vis Comput 39, 2177–2190 (2023). https://doi.org/10.1007/s00371-022-02472-8
DOI: https://doi.org/10.1007/s00371-022-02472-8