Paying Attention to Style: Recognizing Photo Styles with Convolutional Attentional Units

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11367)

Abstract

The notion of style in photographs is highly subjective and often difficult to characterize computationally. Recent advances in learning techniques for visual recognition have opened new possibilities for computing aesthetics and related concepts in images. In this paper, we design an approach for recognizing styles in photographs by introducing adapted deep convolutional neural networks that are attentive towards strong neural activations. The proposed convolutional attentional units act as a filtering mechanism that conserves activations in convolutional blocks so that they contribute more meaningfully towards the visual style classes. State-of-the-art results were achieved on two large image style datasets, demonstrating the effectiveness of our method.

This work is supported in part by the Shanghai ‘Belt and Road’ Young Scholar Exchange Grant (17510740100), Shanghai Jiao Tong University, and Multimedia University.

Notes

  1. (a) For pre-trained ResNet-50, we perform sample normalization on the extracted avg_pool features. (b) For fine-tuning ResNet-50, samples are mean subtracted.
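
The two preprocessing steps in this note can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the note does not say which normalization is meant, so the sketch assumes per-sample L2 normalization for (a) and a per-channel mean computed over the batch for (b). The function names `l2_normalize_samples` and `subtract_mean` are hypothetical, and random arrays stand in for real ResNet-50 avg_pool features (2048 values per image) and input images.

```python
import numpy as np


def l2_normalize_samples(features, eps=1e-12):
    """Scale each row (one sample's feature vector) to unit L2 norm.

    Assumption: "sample normalization" is read here as per-sample L2 scaling.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    # Guard against division by zero for all-zero feature vectors.
    return features / np.maximum(norms, eps)


def subtract_mean(images, mean=None):
    """Subtract a per-channel mean from images shaped (N, H, W, C).

    Assumption: the mean is per channel; if none is given it is
    computed from the batch itself.
    """
    if mean is None:
        mean = images.mean(axis=(0, 1, 2))
    return images - mean, mean


# Random stand-ins for real data.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 2048))            # avg_pool-sized feature vectors
imgs = rng.uniform(0, 255, size=(4, 8, 8, 3))  # tiny fake RGB batch

normed = l2_normalize_samples(feats)
centered, channel_mean = subtract_mean(imgs)
```

In practice the mean in (b) would be computed once on the training set and reused for validation and test images by passing it in as `mean`, rather than being recomputed per batch.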

Author information

Corresponding author

Correspondence to John See.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

See, J., Wong, LK., Kairanbay, M. (2019). Paying Attention to Style: Recognizing Photo Styles with Convolutional Attentional Units. In: Carneiro, G., You, S. (eds) Computer Vision – ACCV 2018 Workshops. ACCV 2018. Lecture Notes in Computer Science, vol 11367. Springer, Cham. https://doi.org/10.1007/978-3-030-21074-8_10

  • DOI: https://doi.org/10.1007/978-3-030-21074-8_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-21073-1

  • Online ISBN: 978-3-030-21074-8

  • eBook Packages: Computer Science, Computer Science (R0)
