deepGTTM-I&II: Local Boundary and Metrical Structure Analyzer Based on Deep Learning Technique

Hamanaka, Masatoshi; Hirata, Keiji; Tojo, Satoshi

doi:10.1007/978-3-319-67738-5_1

Conference paper
First Online: 16 September 2017

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10525)

Abstract

This paper describes an analyzer for detecting local grouping boundaries and generating metrical structures of music pieces based on the generative theory of tonal music (GTTM). Although systems for automatically detecting local grouping boundaries and generating metrical structures, such as the full automatic time-span tree analyzer, have been proposed, they make numerous errors, so musicologists have to correct the boundaries or strong beat positions by hand. In light of this, we use a deep learning technique for detecting local boundaries and generating metrical structures of music pieces based on the GTTM. Because we have only 300 pieces of music whose local grouping boundaries and metrical structures have been analyzed by musicologists, directly learning the relationship between scores and metrical structures is difficult due to the lack of training data. To solve this problem, we propose a multi-task learning analyzer called deepGTTM-I&II, based on the above deep learning technique, that learns the relationship between scores and metrical structures in the following three steps. First, we conduct unsupervised pre-training of a network using 15,000 pieces of music in a non-labeled dataset. After pre-training, the network undergoes supervised fine-tuning by back propagation from the output to the input layers using a half-labeled dataset, which consists of 15,000 pieces of music labeled with an automatic analyzer that we previously constructed. Finally, the network undergoes supervised fine-tuning using a labeled dataset. The experimental results indicate that deepGTTM-I&II outperformed previous GTTM analyzers in terms of the F-measure for generating metrical structures.
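The abstract reports results as an F-measure but does not spell out the evaluation protocol. As a minimal sketch, assuming that predicted and reference analyses are each reduced to a set of positions (for example, the note indices after which a local grouping boundary is marked), the standard precision, recall, and F-measure can be computed as follows. The function name `f_measure` and the example positions are hypothetical and are not taken from the paper.

```python
def f_measure(predicted, reference):
    """Return (precision, recall, F-measure) for two collections of positions.

    Assumption: a prediction counts as correct only when it matches a
    reference position exactly. The paper may use a different criterion.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical note indices after which a boundary is marked.
reference_boundaries = [3, 7, 11, 15]
predicted_boundaries = [3, 7, 12, 15, 18]

precision, recall, f = f_measure(predicted_boundaries, reference_boundaries)
print(f"precision={precision:.2f} recall={recall:.2f} F={f:.2f}")
# prints "precision=0.60 recall=0.75 F=0.67"
```

Here three of the five predicted boundaries match the four reference boundaries, so precision is 3/5 and recall is 3/4. The same function would apply to strong beat positions at one metrical level if those are also represented as sets of positions.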

Acknowledgments

This work was supported by JSPS KAKENHI Grant Numbers 25700036, 16H01744, and 23500145.

Author information

Authors and Affiliations

Center for Advanced Integrated Intelligence Project, RIKEN, Nihonbashi 1-chome Mitsui Building, 15F 1-4-1 Nihonbashi, Chuo-ku, Tokyo, 103-0027, Japan
Masatoshi Hamanaka
Future University Hakodate, 116-2 Kamedanakano-cho, Hakodate, Hokkaido, 041-8655, Japan
Keiji Hirata
Graduate School of Information Science, Japan Advanced Institute of Science and Technology (JAIST), 1-1 Asahidai, Nomi, Ishikawa, 923-1292, Japan
Satoshi Tojo

Corresponding author

Correspondence to Masatoshi Hamanaka.

Editor information

Editors and Affiliations

Laboratoire PRISM, CNRS-AMU, Marseille, France
Mitsuko Aramaki
Laboratoire PRISM, CNRS-AMU, Marseille, France
Richard Kronland-Martinet
Laboratoire PRISM, CNRS-AMU, Marseille, France
Sølvi Ystad

Cite this paper

Hamanaka, M., Hirata, K., Tojo, S. (2017). deepGTTM-I&II: Local Boundary and Metrical Structure Analyzer Based on Deep Learning Technique. In: Aramaki, M., Kronland-Martinet, R., Ystad, S. (eds) Bridging People and Sound. CMMR 2016. Lecture Notes in Computer Science, vol 10525. Springer, Cham. https://doi.org/10.1007/978-3-319-67738-5_1

DOI: https://doi.org/10.1007/978-3-319-67738-5_1
Published: 16 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67737-8
Online ISBN: 978-3-319-67738-5
eBook Packages: Computer Science, Computer Science (R0)