An Audio-Visual Approach to Music Genre Classification through Affective Color Features

  • Alexander Schindler
  • Andreas Rauber
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9022)


This paper presents a study on classifying music by affective visual information extracted from music videos. The proposed audio-visual approach analyzes genre-specific utilization of color. A comprehensive set of color-specific image processing features used for affect and emotion recognition, derived from psychological experiments or art theory, is evaluated in the visual and multi-modal domain against contemporary audio content descriptors. The evaluation of the presented color features is based on comparative classification experiments on the newly introduced ‘Music Video Dataset’. Results show that a combination of the modalities can improve non-timbral and rhythmic features but has an insignificant effect on high-performing audio features.
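As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch computes simple saturation and brightness statistics per video frame, averages them over a video, and concatenates the result with an audio feature vector before classification. It is not the authors' feature set: the function names `color_features`, `video_features`, and `fuse`, the choice of statistics, and the concatenation-based fusion are all assumptions made for illustration.

```python
import colorsys
from statistics import mean, pstdev


def color_features(frame):
    """Summarize one frame (hypothetical helper, not from the paper).

    frame: iterable of (r, g, b) tuples with channels in 0..255.
    Returns [mean saturation, std saturation, mean brightness, std brightness].
    """
    hsv = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for r, g, b in frame]
    sat = [s for _, s, _ in hsv]
    val = [v for _, _, v in hsv]
    return [mean(sat), pstdev(sat), mean(val), pstdev(val)]


def video_features(frames):
    """Average the per-frame statistics over all sampled frames of a video."""
    per_frame = [color_features(f) for f in frames]
    return [mean(column) for column in zip(*per_frame)]


def fuse(audio_vec, visual_vec):
    """Combine modalities by simple concatenation (an assumed fusion scheme)."""
    return list(audio_vec) + list(visual_vec)


if __name__ == "__main__":
    # Two tiny synthetic "frames": one pure red pixel, one black pixel.
    frames = [[(255, 0, 0)], [(0, 0, 0)]]
    visual = video_features(frames)
    print(fuse([0.1, 0.2], visual))
```

The fused vector could then be passed to any standard classifier, such as the support vector machines or random forests named in the keywords.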


Support Vector Machine; Random Forest; Image Retrieval; Music Video; Audio Feature
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Alexander Schindler (1, 2)
  • Andreas Rauber (1)

  1. Department of Software Technology and Interactive Systems, Vienna University of Technology, Austria
  2. Information management, Digital Safety and Security Department, AIT Austrian Institute of Technology, Austria
