Learning the Long-Term Structure of the Blues
In general, music composed by recurrent neural networks (RNNs) suffers from a lack of global structure. Though networks can learn note-by-note transition probabilities and even reproduce phrases, they have been unable to learn an entire musical form and use that knowledge to guide composition. In this study, we describe the details of the model and present experimental results showing that Long Short-Term Memory (LSTM) successfully learns a form of blues music and is able to compose novel (and, some listeners believe, pleasing) melodies in that style. Remarkably, once the network has found the relevant structure, it does not drift from it: LSTM is able to play the blues with good timing and proper structure for as long as one is willing to listen.
Keywords: Recurrent Neural Network, Cell Block, Music Composition, Musical Form, Sheet Music