Multimedia Tools and Applications, Volume 47, Issue 3, pp 433–460

Music emotion classification and context-based music recommendation

  • Byeong-jun Han
  • Seungmin Rho
  • Sanghoon Jun
  • Eenjun Hwang


Context-based music recommendation is one of the rapidly emerging applications of the ubiquitous computing era. It requires multidisciplinary efforts, including low-level feature extraction, music classification, human emotion description and prediction, ontology-based representation and recommendation, and the establishment of connections among them. In this paper, we make three distinct contributions toward incorporating context awareness into music recommendation. First, we propose a novel emotion state transition model (ESTM) to model human emotional states and the transitions between them induced by music. ESTM acts as a bridge between the user's situational information, including his/her emotion, and low-level music features. With ESTM, we can recommend the music most appropriate for moving the user toward a desired emotional state. Second, we present the context-based music recommendation (COMUS) ontology for modeling a user's musical preferences and context, and for supporting reasoning about the user's desired emotion and preferences. COMUS is a music-dedicated ontology in OWL, constructed by extending the Music Ontology with domain-specific classes for music recommendation, including situation, mood, and musical features. Third, to map low-level features to the ESTM, we collected various high-dimensional music feature data and applied nonnegative matrix factorization (NMF) for dimension reduction. We then used a support vector machine (SVM) as the emotional state transition classifier. We built a prototype music recommendation system based on these techniques, carried out various experiments to measure its performance, and report some of the experimental results.
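The classification pipeline described above, NMF-based dimension reduction of high-dimensional music features followed by an SVM classifier, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature dimensionality, the number of latent components, and the four hypothetical emotion-transition classes are all placeholder assumptions, and the data here is synthetic.

```python
# Sketch of the NMF + SVM pipeline from the abstract (hypothetical parameters).
import numpy as np
from sklearn.decomposition import NMF
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((100, 60))    # 100 clips x 60 nonnegative low-level features (synthetic)
y = rng.integers(0, 4, 100)  # 4 hypothetical emotion-transition classes

# Reduce the 60-D nonnegative features to 8 latent components via NMF,
# then train an RBF-kernel SVM on the reduced representation.
model = make_pipeline(
    NMF(n_components=8, init="nndsvda", max_iter=500, random_state=0),
    SVC(kernel="rbf"),
)
model.fit(X, y)
preds = model.predict(X[:5])
print(preds)
```

NMF requires nonnegative inputs, which suits magnitude-based audio features; the latent components can be read as additive "parts" of the feature space, which is why the authors favor it over signed decompositions such as PCA.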


Emotion state transition model · Music information retrieval · Mood · Emotion · Classification · Recommendation



This work was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD) (KRF-2007-313-D00758).



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Byeong-jun Han (1)
  • Seungmin Rho (2)
  • Sanghoon Jun (1)
  • Eenjun Hwang (1)

  1. School of Electrical Engineering, Korea University, Seoul, Korea
  2. School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
