Abstract
The notion of style in photographs is one that is highly subjective, and often difficult to characterize computationally. Recent advances in learning techniques for visual recognition have encouraged new possibilities for computing aesthetics and other related concepts in images. In this paper, we design an approach for recognizing styles in photographs by introducing adapted deep convolutional neural networks that are attentive towards strong neural activations. The proposed convolutional attentional units act as a filtering mechanism that conserves activations in convolutional blocks in order to contribute more meaningfully towards the visual style classes. State-of-the-art results were achieved on two large image style datasets, demonstrating the effectiveness of our method.
This work is supported in part by Shanghai ‘Belt and Road’ Young Scholar Exchange Grant (17510740100), Shanghai Jiao Tong University and Multimedia University.
This is a preview of subscription content, access via your institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsNotes
- 1.
(a) For pre-trained ResNet-50, we perform sample normalization on the extracted avg_pool features. (b) For fine-tuning ResNet-50, samples are mean subtracted.
References
Amirshahi, S.A., Hayn-Leichsenring, G.U., Denzler, J., Redies, C.: JenAesthetics subjective dataset: analyzing paintings by subjective scores. In: Agapito, L., Bronstein, M.M., Rother, C. (eds.) ECCV 2014. LNCS, vol. 8925, pp. 3–19. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16178-5_1
Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)
Bar, Y., Levy, N., Wolf, L.: Classification of artistic styles using binarized features derived from a deep neural network. In: Agapito, L., Bronstein, M.M., Rother, C. (eds.) ECCV 2014. LNCS, vol. 8925, pp. 71–84. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16178-5_5
Bhattacharya, S., Sukthankar, R., Shah, M.: A framework for photo-quality assessment and enhancement based on visual aesthetics. In: Proceedings of the 18th ACM International Conference on Multimedia, pp. 271–280. ACM (2010)
Chu, W.T., Wu, Y.L.: Deep correlation features for image style classification. In: Proceedings of the 2016 ACM on Multimedia Conference, pp. 402–406. ACM (2016)
Fang, C., Lin, Z., Mech, R., Shen, X.: Automatic image cropping using visual composition, boundary simplicity and content preservation models. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 1105–1108. ACM (2014)
Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2414–2423 (2016)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Hermann, K.M., et al.: Teaching machines to read and comprehend. In: Advances in Neural Information Processing Systems, pp. 1693–1701 (2015)
Hii, Y.L., See, J., Kairanbay, M., Wong, L.K.: Multigap: multi-pooled inception network with text augmentation for aesthetic prediction of photographs. In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 1722–1726. IEEE (2017)
Kairanbay, M., See, J., Wong, L.-K.: Aesthetic evaluation of facial portraits using compositional augmentation for deep CNNs. In: Chen, C.-S., Lu, J., Ma, K.-K. (eds.) ACCV 2016. LNCS, vol. 10117, pp. 462–474. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54427-4_34
Karayev, S., et al.: Recognizing image style. arXiv preprint arXiv:1311.3715 (2013)
Kiapour, M.H., Yamaguchi, K., Berg, A.C., Berg, T.L.: Hipster wars: discovering elements of fashion styles. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 472–488. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_31
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Liu, L., Chen, R., Wolf, L., Cohen-Or, D.: Optimizing photo composition. Comput. Graph. Forum 29(2), 469–478 (2010)
Lu, X., Lin, Z., Jin, H., Yang, J., Wang, J.Z.: Rapid: rating pictorial aesthetics using deep learning. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 457–466. ACM (2014)
Lu, X., Lin, Z., Shen, X., Mech, R., Wang, J.Z.: Deep multi-patch aggregation network for image style, aesthetics, and quality estimation. In: International Conference on Computer Vision, pp. 990–998 (2015)
Mnih, V., Heess, N., Graves, A., et al.: Recurrent models of visual attention. In: Advances in Neural Information Processing Systems, pp. 2204–2212 (2014)
Murray, N., Marchesotti, L., Perronnin, F.: AVA: a large-scale database for aesthetic visual analysis. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2408–2415. IEEE (2012)
Saleh, B., Elgammal, A.: Large-scale classification of fine-art paintings: learning the right metric on the right feature. arXiv preprint arXiv:1505.00855 (2015)
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: International Conference on Computer Vision, pp. 618–626. IEEE (2017)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Tai, Y., Yang, J., Liu, X.: Image super-resolution via deep recursive residual network. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2790–2798. IEEE (2017)
Tan, W.R., Chan, C.S., Aguirre, H.E., Tanaka, K.: Ceci n’est pas une pipe: a deep convolutional network for fine-art paintings classification. In: 2016 IEEE International Conference on Image Processing (ICIP), pp. 3703–3707. IEEE (2016)
Wilber, M.J., Fang, C., Jin, H., Hertzmann, A., Collomosse, J., Belongie, S.J.: BAM! The behance artistic media dataset for recognition beyond photography. In: International Conference on Computer Vision, pp. 1211–1220 (2017)
Xu, K., et al.: Show, attend and tell: neural image caption generation with visual attention. In: International Conference on Machine Learning, pp. 2048–2057 (2015)
Zhang, K., Zuo, W., Chen, Y., Meng, D., Zhang, L.: Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155 (2017)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
See, J., Wong, LK., Kairanbay, M. (2019). Paying Attention to Style: Recognizing Photo Styles with Convolutional Attentional Units. In: Carneiro, G., You, S. (eds) Computer Vision – ACCV 2018 Workshops. ACCV 2018. Lecture Notes in Computer Science(), vol 11367. Springer, Cham. https://doi.org/10.1007/978-3-030-21074-8_10
Download citation
DOI: https://doi.org/10.1007/978-3-030-21074-8_10
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21073-1
Online ISBN: 978-3-030-21074-8
eBook Packages: Computer ScienceComputer Science (R0)