KI - Künstliche Intelligenz, Volume 27, Issue 1, pp 25–35

Can Computers Learn from the Aesthetic Wisdom of the Crowd?

Technical Contribution

Abstract

The social media revolution has led to an abundance of image and video data on the Internet. Since this data is typically annotated, rated, or commented upon by large communities, it provides new opportunities and challenges for computer vision. Social networking and content-sharing sites seem to hold the key to the integration of context and semantics into image analysis. In this paper, we explore the use of social media in this regard. We present empirical results obtained on a set of 127,593 images with 3,741,176 tag assignments that were harvested from Flickr, a photo-sharing site. We report on how users tag and rate photos and present an approach towards automatically recognizing the aesthetic appeal of images, using confidence-based classifiers to alleviate effects due to ambiguously labeled data. Our results indicate that user-generated content allows for learning about aesthetic appeal. In particular, established low-level image features seem to enable the recognition of beauty. A reliable recognition of unseemliness, on the other hand, appears to require more elaborate high-level analysis.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. B-IT, University of Bonn, Bonn, Germany
  2. IGG, University of Bonn, Bonn, Germany
  3. Fraunhofer IAIS, Sankt Augustin, Germany
