Abstract
This paper presents a novel dataset of songs by non-superstar artists in which, for each song, the musical structure and the artist's emotional perception are identified through a categorical emotional labeling process. The creation of this preliminary dataset is motivated by biases that have been detected in the most widely used datasets in the field of emotion-based music recommendation. The new dataset contains 234 min of audio: 60 complete, labeled songs. In addition, an emotional analysis is carried out in which dynamic emotional perception is represented through a time-series approach: the similarity values generated by the dynamic time warping (DTW) algorithm are analyzed and then used to implement a clustering process with the K-means algorithm. Clustering is also implemented with Uniform Manifold Approximation and Projection (UMAP), a manifold learning and dimension reduction technique, and the HDBSCAN algorithm is applied to determine the optimal number of clusters. The results obtained from the different clustering strategies are compared and, in a preliminary analysis, show significant consistency. Based on these findings and experimental results, a discussion is presented highlighting the importance of working with complete songs, preferably with a well-defined musical structure, given the emotional variation that characterizes a song during the listening experience, in which the intensity of the emotion usually changes between verse, bridge, and chorus.
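The core of the similarity analysis described above is the DTW distance between per-song emotion time series. As a minimal illustrative sketch (pure Python, not the authors' implementation; the `dtw_distance` helper below is hypothetical and shows only the classic dynamic-programming recurrence), the distance between two 1-D series can be computed as:

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D series.

    cost[i][j] holds the minimal accumulated cost of aligning the
    first i points of `a` with the first j points of `b`.
    """
    n, m = len(a), len(b)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])          # local distance
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]


# Two emotion curves with the same shape but shifted in time:
# DTW warps the time axis, so their distance is zero even though
# a point-by-point (Euclidean) comparison would not be.
curve_a = [0, 0, 1, 2, 1, 0]
curve_b = [0, 1, 2, 1, 0, 0]
print(dtw_distance(curve_a, curve_b))  # 0.0
```

A pairwise matrix of such distances over all songs is the kind of input a clustering stage (e.g., K-means on DTW similarities, as in the paper) can consume.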
Notes
Available at: https://github.com/yesidospitiamedina/ENSA.
Available at: http://104.237.5.250/evaluacionensa/form.php.
Available at: https://github.com/yesidospitiamedina/ENSA.
Acknowledgements
We express our special gratitude to each of the artists and bands that made this research possible: Bajo Cuerda, Psicophony, Madriguera, Skaparate, Kimberly Aguiar, Víctor Roll, Resistencia al Olvido, Lina Cardona Herrera, Denisse Rocío García Lozano, and Atadura.
Funding
This research has been partially supported by the Spanish Ministry of Science, Innovation and Universities through project RTI2018-096986-B-C31 and by the Aragonese Government through the AffectiveLab-T60-23R project. It has also been partially supported by the Computer Science School of the National University of La Plata (UNLP) through the Ph.D. program in Computer Science. This work is part of the project Technological Ecosystem for the MOod Recognition in cardiac rehabilitation patients (TEMOR), TED2021-130374B-C22, funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR.
Ethics declarations
Conflicts of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ospitia-Medina, Y., Beltrán, J.R. & Baldassarri, S. ENSA dataset: a dataset of songs by non-superstar artists tested with an emotional analysis based on time-series. Pers Ubiquit Comput 27, 1909–1925 (2023). https://doi.org/10.1007/s00779-023-01721-4