Abstract
An image is worth a thousand words for sentiment expression, but the semantic gap between low-level pixels and high-level sentiment makes visual sentiment analysis difficult. Our work bridges this gap from two directions: (1) learning high-level abstract features for visual sentiment content, and (2) exploiting large-scale unlabeled data. We propose a hierarchical structure for the automatic discovery of visual sentiment features, called SentiNet, which employs a convolutional neural network (ConvNet) architecture. To cope with the scarcity of labeled data, we leverage sentiment-related signals to pre-annotate unlabeled samples from different source domains. In particular, we propose a hierarchy-stack fine-tuning strategy to train SentiNet. We show how this pipeline can be applied to visual sentiment analysis on social media. Our experiments on real-world data, covering half a million unlabeled images and two thousand labeled images, show that our method outperforms state-of-the-art visual methods and demonstrate the importance of large-scale data and hierarchical architectures for visual sentiment analysis.
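The two-stage pipeline sketched in the abstract, pre-annotating a large unlabeled pool with a sentiment-related signal and then fine-tuning on a small labeled set, can be illustrated in miniature. The sketch below is an assumption-laden toy, not the paper's method: a logistic-regression model on synthetic features stands in for SentiNet's ConvNet, and the weak-labeling signal, data, and hyperparameters are all invented for illustration.

```python
import numpy as np

# Illustrative sketch of the abstract's two-stage pipeline:
#   stage 1: pre-annotate unlabeled samples with a sentiment-related signal
#            and pre-train on those weak labels;
#   stage 2: fine-tune (warm-start) on the small hand-labeled set.
# A logistic regression stands in for the ConvNet; everything is synthetic.

rng = np.random.default_rng(0)

def weak_label(text_scores):
    """Pre-annotate: map a noisy sentiment-related signal (e.g. caption
    polarity) to a binary pseudo-label."""
    return (text_scores > 0).astype(float)

def train(X, y, w=None, lr=0.1, epochs=200):
    """Logistic regression via batch gradient descent; passing `w`
    warm-starts training, the analogue of fine-tuning pre-trained weights."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))       # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)       # gradient step
    return w

# Synthetic "features": a large weakly supervised pool, a small labeled set.
X_unlabeled = rng.normal(size=(500, 5))
text_signal = X_unlabeled[:, 0] + 0.5 * rng.normal(size=500)  # noisy proxy
X_labeled = rng.normal(size=(40, 5))
y_labeled = (X_labeled[:, 0] > 0).astype(float)

w_pre = train(X_unlabeled, weak_label(text_signal))  # stage 1: weak pre-training
w_fin = train(X_labeled, y_labeled, w=w_pre)         # stage 2: fine-tune

acc = np.mean((1.0 / (1.0 + np.exp(-X_labeled @ w_fin)) > 0.5) == y_labeled)
```

The design point the sketch captures is that stage 2 starts from stage-1 weights rather than from scratch, so the small labeled set only has to correct the weakly pre-trained model, not learn the task alone.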
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Nos. 61422210, 61373076, 61402386, 61305061 and 61572409), the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20110121120024), the Special Funds for the Development of Strategic Emerging Industries of Shenzhen, China (Grant No. JCYJ20120614164600201), the Fundamental Research Funds for the Central Universities (No. 2013121026), the Natural Science Foundation of Fujian Province, China (Grant No. 2014J01249) and the Special Fund for Earthquake Research in the Public Interest (No. 201508025).
© 2017 Springer International Publishing AG
Cite this paper
Li, L., Li, S., Cao, D., Lin, D. (2017). SentiNet: Mining Visual Sentiment from Scratch. In: Angelov, P., Gegov, A., Jayne, C., Shen, Q. (eds) Advances in Computational Intelligence Systems. Advances in Intelligent Systems and Computing, vol 513. Springer, Cham. https://doi.org/10.1007/978-3-319-46562-3_20
Print ISBN: 978-3-319-46561-6
Online ISBN: 978-3-319-46562-3