Discovering Beautiful Attributes for Aesthetic Image Analysis

International Journal of Computer Vision

Abstract

Aesthetic image analysis is the study and assessment of the aesthetic properties of images. Current computational approaches to aesthetic image analysis provide either accurate or interpretable results, but not both. To obtain results that are both accurate and interpretable by humans, we advocate the use of learned and nameable visual attributes as mid-level features. For this purpose, we propose to discover and learn the visual appearance of attributes automatically, using a recently introduced database called AVA, which contains more than 250,000 images together with their aesthetic scores and textual comments given by photography enthusiasts. We provide a detailed analysis of these annotations as well as the context in which they were given. We then describe how the three key components of AVA (images, scores, and comments) can be leveraged to learn visual attributes. Lastly, we show that these learned attributes can be successfully used in three applications: aesthetic quality prediction, image tagging, and image retrieval.

Notes

  1. http://www.dpchallenge.com/help_faq.php#howcomments.

  2. http://www.dpchallenge.com/forum.php?action=read&FORUM_THREAD_ID=19842.

  3. http://www.crowdflower.com/.

Table 4 Statistics on comments in AVA

Acknowledgments

The authors would like to thank Jean-Michel Renders for the discussions about text analysis and Isaac Alonso for supporting the experimental work in this paper.

Author information

Corresponding author

Correspondence to Naila Murray.

Additional information

Communicated by Michael Valstar, Andrew French, and Tony Pridmore.

About this article

Cite this article

Marchesotti, L., Murray, N. & Perronnin, F. Discovering Beautiful Attributes for Aesthetic Image Analysis. Int J Comput Vis 113, 246–266 (2015). https://doi.org/10.1007/s11263-014-0789-2

  • DOI: https://doi.org/10.1007/s11263-014-0789-2
