Abstract
We design a deep learning framework that generates landscape images matching a given emotion. We tackle the more challenging task of generating landscape scenes, which lack the salient main objects that usually make an emotion easy to recognize. To solve this problem, we propose deep networks based on generative adversarial networks. A new residual unit, the emotional residual unit (ERU), is proposed to better reflect the target emotion during training, together with an affective feature matching loss (AFM-loss) optimized for emotional image generation. This approach produces images that better match the given emotions. To demonstrate the performance of the proposed model, we conducted a set of experiments including user studies. The results show a higher preference for the new model over previous ones, confirming that it produces images suited to the given emotions. Ablation studies demonstrate that the ERU and AFM-loss each enhance the performance of the model.
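The abstract names the two proposed components but does not specify their form. The following NumPy sketch illustrates the general ideas only, under stated assumptions: dense layers stand in for the generator's convolutions, the emotion is a one-hot vector projected into feature space, and the loss matches per-layer feature means. All function names, shapes, and weights here are hypothetical illustrations, not the paper's actual ERU or AFM-loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def emotional_residual_unit(x, emotion, W1, W2, E):
    """Sketch of a residual unit conditioned on an emotion vector.
    Hypothetical simplification: dense layers instead of convolutions."""
    e = emotion @ E                   # project one-hot emotion into feature space
    h = np.maximum(0.0, x @ W1 + e)  # emotion-conditioned pre-activation (ReLU)
    return x + h @ W2                # residual connection

def afm_loss(real_feats, fake_feats):
    """Feature-matching-style loss: mean L1 gap between per-layer feature
    means of real and generated batches (a generic stand-in, not the
    paper's exact AFM-loss formulation)."""
    gaps = [np.abs(r.mean(axis=0) - f.mean(axis=0)).mean()
            for r, f in zip(real_feats, fake_feats)]
    return float(np.mean(gaps))

# Toy usage with made-up dimensions (batch 4, 16 features, 8 emotion classes).
x = rng.normal(size=(4, 16))
emotion = np.eye(8)[rng.integers(0, 8, size=4)]  # one-hot emotion labels
W1, W2 = rng.normal(size=(16, 16)), rng.normal(size=(16, 16))
E = rng.normal(size=(8, 16))
y = emotional_residual_unit(x, emotion, W1, W2, E)
loss = afm_loss([x], [y])  # non-negative scalar
```

The design intuition is that injecting the emotion inside each residual transformation lets every block steer features toward the target emotion, while a feature-matching term compares statistics of intermediate activations rather than raw pixels.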
Acknowledgement
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01419) supervised by the IITP (Institute for Information and Communications Technology Planning and Evaluation), and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2020R1A2C2014622).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Park, C., Lee, I.K. (2021). Emotional Landscape Image Generation Using Generative Adversarial Networks. In: Ishikawa, H., Liu, C.L., Pajdla, T., Shi, J. (eds.) Computer Vision – ACCV 2020. Lecture Notes in Computer Science, vol. 12625. Springer, Cham. https://doi.org/10.1007/978-3-030-69538-5_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69537-8
Online ISBN: 978-3-030-69538-5
eBook Packages: Computer Science; Computer Science (R0)