Indexing music by mood: design and integration of an automatic content-based annotator

Published in Multimedia Tools and Applications

Abstract

In the context of content analysis for indexing and retrieval, we present a method for automatic music mood annotation. The method builds on results from psychological studies and is framed as a supervised learning problem over musical features extracted automatically from the raw audio signal. We describe the audio features most relevant to this task. A ground truth for training is created using both social network information systems (the wisdom of crowds) and individual experts (the wisdom of the few). At the experimental level, we evaluate our approach on a database of 1,000 songs. Tests of different classification methods, configurations and optimizations show that Support Vector Machines perform best for the task at hand. Moreover, we evaluate the algorithm's robustness to different audio compression schemes; this aspect, often neglected, is fundamental to building a system that is usable in real-world conditions. In addition, we discuss the integration of a fast and scalable version of this technique into the European project PHAROS. This real-world application demonstrates the tool's usability for annotating large-scale databases. We also report on a user evaluation in the context of the PHAROS search engine, asking people about the perceived utility, interest and novelty of this technology in real-world use cases.
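
The approach summarized above (framewise audio descriptors summarized per song, then a Support Vector Machine trained on crowd- and expert-derived mood labels) can be approximated with off-the-shelf tools. The following Python sketch is illustrative only, not the authors' implementation: it assumes librosa for feature extraction (MFCCs standing in for the paper's larger feature set), scikit-learn for the SVM, and a hypothetical labels.csv ground-truth file with path,mood rows.

    # Minimal sketch of a content-based mood annotator in the spirit of the
    # paper: per-song audio descriptors, then an SVM classifier.
    # Assumptions (not from the paper): librosa features, scikit-learn SVC,
    # and a hypothetical labels.csv listing "path,mood" pairs.
    import csv
    import numpy as np
    import librosa
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def song_features(path):
        """Summarize framewise descriptors (MFCCs here) as per-song statistics."""
        audio, sr = librosa.load(path, sr=22050, mono=True)
        mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
        # Mean and standard deviation over frames -> fixed-length song vector.
        return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

    with open("labels.csv") as f:  # hypothetical ground-truth file
        rows = list(csv.DictReader(f))

    X = np.array([song_features(r["path"]) for r in rows])
    y = np.array([r["mood"] for r in rows])

    # RBF-kernel SVM with feature standardization, evaluated by
    # cross-validation, mirroring the paper's finding that SVMs
    # perform best for this task.
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
    print("10-fold CV accuracy:", cross_val_score(clf, X, y, cv=10).mean())

In a deployment such as the PHAROS integration discussed in the paper, annotating a new track then reduces, roughly speaking, to one feature extraction plus one SVM prediction, which is what makes the technique scalable to large databases.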

Notes

  1. In psychology, the term valence describes the attractiveness or aversiveness of an event, object or situation. For instance, happiness and joy have positive valence, while anger and fear have negative valence (the sketch after these notes illustrates how valence pairs with arousal to yield basic mood categories).

  2. http://www.music-ir.org/mirex2007/index.php/Audio_Music_Mood_Classification

  3. http://trec.nist.gov/

  4. http://www-nlpir.nist.gov/projects/trecvid/

  5. http://www.last.fm

  6. WordNet is a large lexical database of English in which words are grouped into sets of synonyms: http://wordnet.princeton.edu/

  7. http://www.pharos-audiovisual-search.eu

  8. http://www.webratio.com
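
The valence dimension defined in note 1 is commonly paired with a second dimension, arousal (energy), as in Russell's circumplex model of affect; basic mood categories then correspond to quadrants of the valence-arousal plane. The following is a minimal, hypothetical illustration; the zero thresholds and quadrant labels are one common reading of that model, not the mood taxonomy used in the paper.

    def mood_quadrant(valence, arousal):
        """Map a (valence, arousal) point, each in [-1, 1], to a quadrant label.

        Illustrative only: the labels are a common reading of the circumplex
        model of affect, not the paper's taxonomy.
        """
        if valence >= 0:
            return "happy/exuberant" if arousal >= 0 else "relaxed/peaceful"
        return "angry/anxious" if arousal >= 0 else "sad/depressed"

    # Positive valence with low arousal reads as calm contentment:
    print(mood_quadrant(0.8, -0.3))  # -> relaxed/peaceful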

Acknowledgments

We are very grateful to all the human annotators who helped to create our ground truth dataset. We also want to thank everyone contributing to the technologies of the Music Technology Group (Universitat Pompeu Fabra, Barcelona), in particular Nicolas Wack, Eduard Aylon and Robert Toscano. We are also grateful to the entire MIREX team, specifically Stephen Downie and Xiao. Finally, we want to thank Michel Plu and Valérie Botherel from Orange Labs for the user evaluation data, and Piero Fraternali, Alessandro Bozzon and Marco Brambilla from WebModels for the user interface. This research has been partially funded by the EU project PHAROS IST-2006-045035.

Author information

Corresponding author

Correspondence to Cyril Laurier.

About this article

Cite this article

Laurier, C., Meyers, O., Serrà, J. et al. Indexing music by mood: design and integration of an automatic content-based annotator. Multimed Tools Appl 48, 161–184 (2010). https://doi.org/10.1007/s11042-009-0360-2
