SECT: Sentiment-Enriched Continual Training for Image Sentiment Analysis

Wu, Lifang; Xing, Lehao; Shi, Ge; Deng, Sinuo; Yang, Jie

doi:10.1007/978-3-031-46305-1_8

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14355)

Included in the following conference series:

International Conference on Image and Graphics

Abstract

Large-scale pre-trained models have recently achieved notable success on a range of downstream tasks by training contrastively on image-text pairs, learning high-quality general visual representations from natural language supervision. However, these models typically disregard sentiment knowledge during pre-training, which limits their performance on image sentiment analysis. To address this limitation, we propose a sentiment-enriched continual training framework (SECT), which continues training CLIP and introduces multi-level sentiment knowledge during further pre-training through sentiment-based natural language supervision. We also construct a large-scale, weakly annotated sentiment image-text dataset so that the model is trained robustly. In addition, SECT uses three training objectives that integrate multi-level sentiment knowledge into training. Experiments on the EmotionROI, FI, and Twitter I datasets show that SECT yields a pre-trained model that outperforms previous models and CLIP on most of the downstream datasets. Our code will be made publicly available for research purposes.
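For readers unfamiliar with the contrastive image-text training that the abstract builds on, the following is a minimal sketch of a generic CLIP-style symmetric contrastive loss. It is not the SECT method: the abstract does not specify SECT's three training objectives, so none of them are reproduced here. The function names, the toy two-dimensional embeddings, and the temperature value of 0.07 are all illustrative assumptions.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Generic symmetric image-text contrastive loss (illustrative only).

    Row k of image_embs is assumed to be paired with row k of text_embs.
    The temperature default is an assumption, not a value from the paper.
    """
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    # Image-to-text direction: each image should pick out its own caption.
    loss_i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)


# Toy batch: correctly paired embeddings versus deliberately swapped ones.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In this toy batch the correctly paired embeddings give a loss near zero, while the swapped pairing gives a much larger one, which is the signal that pulls matched image-text pairs together during training.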

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China under Grants No. 62236010, 61976010, 62106011, 62106010, and 62176011.

Author information

Authors and Affiliations

Beijing University of Technology, No. 100 Pingleyuan, Chaoyang District, Beijing, 100124, China
Lifang Wu, Lehao Xing, Ge Shi, Sinuo Deng & Jie Yang

Corresponding author

Correspondence to Ge Shi.

Editor information

Editors and Affiliations

Dalian University of Technology, Dalian, China
Huchuan Lu
University of Sydney, Sydney, NSW, Australia
Wanli Ouyang
Shenzhen University, Shenzhen, China
Hui Huang
Tsinghua University, Beijing, China
Jiwen Lu
Dalian University of Technology, Dalian, China
Risheng Liu
Institute of Automation, CAS, Beijing, China
Jing Dong
University of Technology Sydney, Sydney, NSW, Australia
Min Xu

Cite this paper

Wu, L., Xing, L., Shi, G., Deng, S., Yang, J. (2023). SECT: Sentiment-Enriched Continual Training for Image Sentiment Analysis. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14355. Springer, Cham. https://doi.org/10.1007/978-3-031-46305-1_8

DOI: https://doi.org/10.1007/978-3-031-46305-1_8
Published: 29 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46304-4
Online ISBN: 978-3-031-46305-1
eBook Packages: Computer Science, Computer Science (R0)
