
Attentional networks for music generation

  • 1193: Intelligent Processing of Multimedia Signals
  • Published in: Multimedia Tools and Applications

Abstract

Realistic music generation remains a challenging problem, as generated pieces often lack structure or coherence. In this work, we propose a deep-learning-based music generation method that produces traditional-style music, particularly jazz, with repeated melodic structures, using a Bi-directional Long Short-Term Memory (Bi-LSTM) neural network with attention. Given their success in modelling long-term temporal dependencies in sequential data such as video, Bi-LSTMs with attention are a natural choice for music generation, and this work is an early application of them to the task. Our experiments validate that Bi-LSTMs with attention are able to preserve the richness and technical nuances of the music performed.
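The pipeline the abstract describes — bidirectional hidden states pooled by an attention mechanism into a context vector, which is then projected to a distribution over the next note — can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: the weight matrices are random stand-ins for trained parameters, the Bi-LSTM recurrence itself is replaced by random forward/backward hidden states, and only the additive-attention pooling and output projection are computed in full.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy stand-ins for a trained Bi-LSTM's per-timestep hidden states.
T, H, V = 16, 8, 32  # timesteps, hidden size per direction, note vocabulary
h_fwd = rng.standard_normal((T, H))            # forward pass states
h_bwd = rng.standard_normal((T, H))            # backward pass states
h = np.concatenate([h_fwd, h_bwd[::-1]], 1)    # (T, 2H) bidirectional states

# Additive (Bahdanau-style) attention over the Bi-LSTM outputs.
W = rng.standard_normal((2 * H, 2 * H)) * 0.1  # hypothetical attention weights
v = rng.standard_normal(2 * H) * 0.1
scores = np.tanh(h @ W) @ v                    # (T,) unnormalised scores
alpha = softmax(scores)                        # attention weights, sum to 1
context = alpha @ h                            # (2H,) weighted summary vector

# Project the context to a probability distribution over the next note.
W_out = rng.standard_normal((2 * H, V)) * 0.1  # hypothetical output projection
p_next = softmax(context @ W_out)              # (V,) next-note distribution
```

In a trained model, the next note would be sampled from `p_next` and fed back in, generating a melody autoregressively; the attention weights `alpha` let the decoder attend to distant timesteps, which is what helps preserve long-range melodic structure.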


Notes

  1. http://web.mit.edu/music21/doc/moduleReference/moduleConverter.html

  2. https://www.kaggle.com/saikayala/jazz-ml-ready-midi


Author information


Corresponding author

Correspondence to Prerana Mukherjee.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Keerti, G., Vaishnavi, A.N., Mukherjee, P. et al. Attentional networks for music generation. Multimed Tools Appl 81, 5179–5189 (2022). https://doi.org/10.1007/s11042-021-11881-1
