Music structure analysis using self-similarity matrix and two-stage categorization
Music tends to have a distinct structure built from the repetition and variation of components such as verse and chorus. Understanding this structure and its patterns has become increasingly important for music information retrieval (MIR). Many methods for music segmentation and structure analysis have been proposed, each with its own advantages and disadvantages. Because music varies considerably in timbre, articulation, and tempo, the task remains challenging. In this paper, we propose a novel method for music segmentation and structure analysis. We first extract the timbre feature from the acoustic music signal and construct a self-similarity matrix that shows the similarities among the features within the music clip. We then determine candidate boundaries for music segmentation by tracking the standard deviation in the matrix. Finally, we perform two-stage categorization: (i) categorization of the segments in a music clip on the basis of the timbre feature and (ii) categorization of the segments within the same category on the basis of successive chromagram features. In this way, each music clip is represented by a sequence of states, where each state represents a category defined by the two-stage categorization. We demonstrate the performance of the proposed method through experiments.
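The first two steps of the abstract (self-similarity matrix, then boundary candidates from its standard deviation) can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything below is an assumption made for illustration: cosine similarity as the similarity measure, a square window sliding along the main diagonal, and a local-maximum rule with a threshold. The function names and the `width` and `threshold` parameters are hypothetical, and the input is taken to be any per-frame timbre feature array (for instance MFCCs).

```python
import numpy as np


def self_similarity_matrix(features):
    """Cosine self-similarity of a (frames, dims) feature array.

    Assumption: cosine similarity; the paper may use another measure.
    """
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero frames
    return unit @ unit.T


def std_profile(ssm, width):
    """Standard deviation of the SSM inside a square window that slides
    along the main diagonal and covers frames [i - width, i + width).

    Inside a homogeneous segment the window is nearly constant (low std);
    when it straddles two segments it mixes high and low similarities.
    """
    n = ssm.shape[0]
    profile = np.zeros(n)
    for i in range(n):
        lo, hi = max(0, i - width), min(n, i + width)
        profile[i] = ssm[lo:hi, lo:hi].std()
    return profile


def candidate_boundaries(profile, threshold):
    """Frames where the profile is a strict local maximum above threshold."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i] > threshold
        and profile[i] > profile[i - 1]
        and profile[i] > profile[i + 1]
    ]
```

For a toy input made of two constant blocks of 20 frames each, `candidate_boundaries(std_profile(ssm, 3), 0.25)` reports frame 20, the first frame of the second block. Real audio features are noisy, so in practice the window width and threshold would need tuning, and the profile would probably need smoothing before peak picking.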
Keywords: Music structure, Music segmentation, Signal processing, Self-similarity matrix
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2013R1A1A2012627), and by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the C-ITRC (Convergence Information Technology Research Center) support program (NIPA-2013-H0301-13-3006) supervised by the NIPA (National IT Industry Promotion Agency).