Abstract
This paper describes an analyzer that detects local grouping boundaries and generates metrical structures of music pieces based on the generative theory of tonal music (GTTM). Systems for automatically detecting local grouping boundaries and generating metrical structures, such as the full automatic time-span tree analyzer, have been proposed, but they make numerous errors, so musicologists have to correct the boundaries or strong beat positions. We therefore use a deep learning technique for both tasks. Because only 300 pieces of music have local grouping boundaries and metrical structures analyzed by a musicologist, directly learning the relationship between scores and metrical structures is difficult: there is too little training data. To solve this problem, we propose a multi-task learning analyzer called deepGTTM-I&II, which learns this relationship in three steps. First, we conduct unsupervised pre-training of a network using a non-labeled dataset of 15,000 pieces of music. Second, the network undergoes supervised fine-tuning by backpropagation from the output layer to the input layer using a half-labeled dataset, which consists of 15,000 pieces of music labeled with an automatic analyzer that we previously constructed. Finally, the network undergoes supervised fine-tuning using the labeled dataset. The experimental results indicate that deepGTTM-I&II outperformed previous GTTM analyzers in terms of the F-measure for generating metrical structures.
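The abstract reports results as an F-measure but does not spell out how it is computed. The following is a minimal sketch, assuming the standard definition (harmonic mean of precision and recall) applied to per-position binary labels, where 1 marks a strong beat or a local grouping boundary. The function name `f_measure` and the toy label sequences are hypothetical and are not taken from the paper; the paper's actual evaluation protocol may differ.

```python
def f_measure(predicted, reference):
    """Return (precision, recall, F-measure) for two equal-length 0/1 sequences.

    Assumption: each position is a candidate (e.g. a beat or a note transition)
    and 1 means "strong beat" or "boundary" at that position.
    """
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # correctly detected
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # spurious detections
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # missed positions

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical toy data: one position is spurious, one is missed.
reference = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 0, 1, 1, 0, 0, 0, 0]
precision, recall, f1 = f_measure(predicted, reference)
```

In this toy case two of the three predicted positions are correct and two of the three reference positions are found, so precision, recall, and the F-measure all equal 2/3.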
Acknowledgments
This work was supported by JSPS KAKENHI Grant Numbers 25700036, 16H01744, and 23500145.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Hamanaka, M., Hirata, K., Tojo, S. (2017). deepGTTM-I&II: Local Boundary and Metrical Structure Analyzer Based on Deep Learning Technique. In: Aramaki, M., Kronland-Martinet, R., Ystad, S. (eds) Bridging People and Sound. CMMR 2016. Lecture Notes in Computer Science, vol 10525. Springer, Cham. https://doi.org/10.1007/978-3-319-67738-5_1
DOI: https://doi.org/10.1007/978-3-319-67738-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67737-8
Online ISBN: 978-3-319-67738-5
eBook Packages: Computer Science, Computer Science (R0)