Paying Attention to Style: Recognizing Photo Styles with Convolutional Attentional Units

See, John; Wong, Lai-Kuan; Kairanbay, Magzhan

doi:10.1007/978-3-030-21074-8_10

Conference paper
First Online: 19 June 2019

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11367)

Abstract

The notion of style in photographs is one that is highly subjective, and often difficult to characterize computationally. Recent advances in learning techniques for visual recognition have encouraged new possibilities for computing aesthetics and other related concepts in images. In this paper, we design an approach for recognizing styles in photographs by introducing adapted deep convolutional neural networks that are attentive towards strong neural activations. The proposed convolutional attentional units act as a filtering mechanism that conserves activations in convolutional blocks in order to contribute more meaningfully towards the visual style classes. State-of-the-art results were achieved on two large image style datasets, demonstrating the effectiveness of our method.

This work is supported in part by Shanghai ‘Belt and Road’ Young Scholar Exchange Grant (17510740100), Shanghai Jiao Tong University and Multimedia University.

Notes

1. (a) For the pre-trained ResNet-50, we perform sample normalization on the extracted avg_pool features. (b) For the fine-tuned ResNet-50, samples are mean-subtracted.
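
The two preprocessing steps in this note can be illustrated with a minimal NumPy sketch. It rests on assumptions the note does not confirm: "sample normalization" is taken to mean scaling each sample's 2048-dimensional avg_pool feature vector to unit L2 norm, and "mean subtracted" is taken to mean subtracting a per-channel mean from the input images. The function names and the random stand-in data are hypothetical and are not the authors' code.

```python
import numpy as np


def l2_normalize_samples(features, eps=1e-12):
    """Scale each row (one sample's feature vector) to unit L2 norm.

    Assumption: this is one plausible reading of "sample normalization".
    """
    features = np.asarray(features, dtype=np.float64)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    # Guard against division by zero for an all-zero feature vector.
    return features / np.maximum(norms, eps)


def subtract_channel_mean(images, channel_mean):
    """Subtract a per-channel mean from images shaped (N, H, W, C).

    Assumption: the mean is computed per colour channel over a training set.
    """
    images = np.asarray(images, dtype=np.float64)
    return images - np.asarray(channel_mean, dtype=np.float64)


rng = np.random.default_rng(0)

# Stand-in for avg_pool features: 4 samples, 2048 dimensions each.
feats = rng.random((4, 2048))
normed = l2_normalize_samples(feats)

# Stand-in for a small image batch with pixel values in [0, 255).
imgs = rng.random((4, 8, 8, 3)) * 255.0
channel_mean = imgs.mean(axis=(0, 1, 2))
centered = subtract_channel_mean(imgs, channel_mean)
```

After normalization every row of `normed` has unit length, which keeps features comparable in scale when they are fed to a separate classifier; after mean subtraction each channel of `centered` averages to zero over the batch.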

Author information

Authors and Affiliations

Faculty of Computing and Informatics, Multimedia University, Persiaran Multimedia, 63100, Cyberjaya, Selangor, Malaysia
John See, Lai-Kuan Wong & Magzhan Kairanbay
Shanghai Jiao Tong University, Shanghai, 200240, China
John See

Corresponding author

Correspondence to John See.

Editor information

Editors and Affiliations

School of Computer Science, University of Adelaide, Adelaide, Australia
Gustavo Carneiro
Data61, Commonwealth Scientific and Industrial Research Organization, Canberra, Australia
Shaodi You

About this paper

Cite this paper

See, J., Wong, LK., Kairanbay, M. (2019). Paying Attention to Style: Recognizing Photo Styles with Convolutional Attentional Units. In: Carneiro, G., You, S. (eds) Computer Vision – ACCV 2018 Workshops. ACCV 2018. Lecture Notes in Computer Science, vol 11367. Springer, Cham. https://doi.org/10.1007/978-3-030-21074-8_10

DOI: https://doi.org/10.1007/978-3-030-21074-8_10
Published: 19 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21073-1
Online ISBN: 978-3-030-21074-8
eBook Packages: Computer Science, Computer Science (R0)
