Abstract
This paper compares two tokenization strategies for modeling chord progressions with an encoder transformer architecture trained on a large dataset of chord progressions in a variety of styles. The first strategy treats every distinct chord as a unique element, yielding a vocabulary of 5202 independent tokens. The second expresses each chord as a dynamic tuple describing its root, nature (e.g., major, minor, diminished), and extensions (e.g., additions or alterations), producing a vocabulary of 59 chord-related tokens plus 75 tokens for style, bars, form, and format. In the second approach, MIDI embeddings, arrays of eight values encoding the notes that form each chord, are added to the positional embedding layer of the transformer architecture. We propose a trigram analysis of the dataset to compare the generated chord progressions with the training data, revealing common progressions and the extent to which a sequence is duplicated. We analyze the progressions generated by the models using HITS@k metrics and a human evaluation in which 10 participants rated the plausibility of the progressions as potential music compositions from a musical perspective. The second model achieved lower validation loss, better metrics, and more musical consistency in the suggested progressions.
This paper is an outcome of: MUSAiC, a project that has received funding from the European Research Council under the European Union’s Horizon 2020 research and innovation program (Grant agreement No. 864189); and a Margarita Salas Grant, UPF, Barcelona, Spain.
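The tuple-based chord tokenization and the trigram comparison described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the hypothetical `tokenize_chord` parser covers only a few common chord symbols rather than the full 59-token vocabulary, and `trigram_counts` shows the kind of n-gram tally used to detect duplicated sequences.

```python
from collections import Counter

def tokenize_chord(symbol):
    """Split a chord symbol into root / nature / extension tokens.

    Hypothetical parser for illustration only; it handles just a few
    common cases (e.g. 'C', 'Abm7', 'G7', 'Bdim'), not the paper's
    actual chord vocabulary.
    """
    root, rest = symbol[0], symbol[1:]
    if rest.startswith(("#", "b")):            # accidental belongs to the root
        root, rest = root + rest[0], rest[1:]
    if rest.startswith("maj"):                 # check 'maj' before bare 'm'
        nature, ext = "major", rest[3:]
    elif rest.startswith("dim"):
        nature, ext = "diminished", rest[3:]
    elif rest.startswith("m"):
        nature, ext = "minor", rest[1:]
    else:
        nature, ext = "major", rest            # plain triad or dominant ('G7')
    tokens = [root, nature]
    if ext:
        tokens.append(ext)
    return tokens

def trigram_counts(progression):
    """Count all consecutive chord trigrams in a progression."""
    return Counter(zip(progression, progression[1:], progression[2:]))
```

For example, `tokenize_chord("Abm7")` yields `["Ab", "minor", "7"]`, and counting trigrams over a repeated ii-V-I turnaround reveals how often the same three-chord sequence recurs, which is the kind of duplication check the trigram analysis applies to generated progressions.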
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Dalmazzo, D., Déguernel, K., Sturm, B.L.T. (2024). The Chordinator: Modeling Music Harmony by Implementing Transformer Networks and Token Strategies. In: Johnson, C., Rebelo, S.M., Santos, I. (eds) Artificial Intelligence in Music, Sound, Art and Design. EvoMUSART 2024. Lecture Notes in Computer Science, vol 14633. Springer, Cham. https://doi.org/10.1007/978-3-031-56992-0_4