MusicEmo: transformer-based intelligent approach towards music emotion generation and recognition

Xin, Ying

doi:10.1007/s12652-024-04811-0

Original Research
Published: 17 May 2024

Journal of Ambient Intelligence and Humanized Computing

Abstract

This paper proposes MusicEmo, a transformer-based intelligent system for music emotion generation and recognition. It highlights the challenge of creating emotionally resonant music that is both musically cohesive and diverse. The approach addresses this challenge with theme-based conditioning, which trains the transformer to manifest the conditioning sequence as thematic material that appears multiple times in the generated result. The MusicEmo architecture incorporates an emotion vector and an LSTM model to create symbolic musical sequences that are musically coherent and emotionally resonant. The proposed framework outperforms state-of-the-art approaches in terms of musical consistency and emotional resonance. The transformer-based approach offers a fresh and original way of creating music based on emotions, and it could change how music is created and experienced in the future.
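As a rough illustration of the theme-based conditioning idea described in the abstract, the sketch below assembles a training sequence in which a theme recurs several times and is prefixed by an emotion token. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `<theme>` and `</theme>` marker tokens, the valence/arousal interpretation of the emotion vector, the quadrant token names, and the function names.

```python
from typing import List, Sequence, Tuple

# Hypothetical marker tokens; the abstract does not specify the vocabulary.
THEME_START = "<theme>"
THEME_END = "</theme>"


def emotion_token(valence: float, arousal: float) -> str:
    """Map a 2-D emotion vector to a quadrant token (an assumed encoding)."""
    v = "hv" if valence >= 0 else "lv"
    a = "ha" if arousal >= 0 else "la"
    return f"<emo_{v}_{a}>"


def build_training_sequence(
    theme: Sequence[str],
    sections: Sequence[Sequence[str]],
    emotion: Tuple[float, float],
) -> List[str]:
    """Prefix an emotion token, then place the marked theme before each section
    so that the theme appears multiple times in the target sequence."""
    tokens: List[str] = [emotion_token(*emotion)]
    for section in sections:
        tokens.append(THEME_START)
        tokens.extend(theme)
        tokens.append(THEME_END)
        tokens.extend(section)
    return tokens


def count_theme_occurrences(tokens: Sequence[str], theme: Sequence[str]) -> int:
    """Count how often the theme appears as a contiguous run of tokens."""
    n = len(theme)
    target = list(theme)
    return sum(
        1 for i in range(len(tokens) - n + 1) if list(tokens[i:i + n]) == target
    )


# Example: a three-note theme, two free sections, a positive high-energy emotion.
theme = ["C4", "E4", "G4"]
sections = [["A4", "F4"], ["D4", "B3"]]
seq = build_training_sequence(theme, sections, (0.8, 0.6))
```

A sequence built this way could serve as a target for a sequence model, with `count_theme_occurrences` acting as a crude check that generated output actually restates the theme. The model in the paper may encode themes and emotions quite differently.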

Author information

Authors and Affiliations

Luoyang Institute of Science and Technology, Luoyang, China
Ying Xin

Corresponding author

Correspondence to Ying Xin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xin, Y. MusicEmo: transformer-based intelligent approach towards music emotion generation and recognition. J Ambient Intell Human Comput (2024). https://doi.org/10.1007/s12652-024-04811-0

Received: 25 May 2023
Accepted: 09 May 2024
Published: 17 May 2024
DOI: https://doi.org/10.1007/s12652-024-04811-0
