Multimedia Tools and Applications, Volume 74, Issue 1, pp 287–302

Music structure analysis using self-similarity matrix and two-stage categorization

Article

Abstract

Music tends to have a distinct structure built from the repetition and variation of components such as verse and chorus. Understanding this structure and its patterns has become increasingly important for music information retrieval (MIR). Many methods for music segmentation and structure analysis have been proposed, each with its own advantages and disadvantages. Because music varies significantly in timbre, articulation, and tempo, the task remains challenging. In this paper, we propose a novel method for music segmentation and structure analysis. We first extract the timbre feature from the acoustic music signal and construct a self-similarity matrix that shows the similarities among the features within the music clip. We then determine candidate boundaries for segmentation by tracking the standard deviation in the matrix. Finally, we perform two-stage categorization: (i) categorization of the segments in a music clip on the basis of the timbre feature and (ii) categorization of the segments within the same category on the basis of successive chromagram features. In this way, each music clip is represented by a sequence of states, where each state represents a category defined by the two-stage categorization. We demonstrate the performance of the proposed method through experiments.
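The first two steps described in the abstract (building a self-similarity matrix, then tracking a standard deviation to find candidate boundaries) can be illustrated with a minimal sketch. The abstract does not give the similarity measure, the window, or the exact boundary rule, so everything specific here is an assumption made for illustration: cosine similarity between frames, a square window slid along the main diagonal, and the hypothetical names `self_similarity_matrix`, `local_std_profile`, `candidate_boundaries`, `half_window`, and `threshold`. Per-frame feature vectors (for example timbre features) are assumed to be already extracted.

```python
import math
from statistics import pstdev


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero frame is similar to nothing
    return dot / (norm_a * norm_b)


def self_similarity_matrix(frames):
    """Return the N x N matrix S with S[i][j] = similarity(frame i, frame j)."""
    n = len(frames)
    return [[cosine_similarity(frames[i], frames[j]) for j in range(n)]
            for i in range(n)]


def local_std_profile(ssm, half_window):
    """Standard deviation of the matrix entries in a square window that
    slides along the main diagonal (assumed rule, not taken from the paper).
    Inside a homogeneous segment the window is nearly constant, so the
    value is low; a window straddling two segments mixes high and low
    similarities, so the value is high."""
    n = len(ssm)
    profile = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window)
        values = [ssm[r][c] for r in range(lo, hi) for c in range(lo, hi)]
        profile.append(pstdev(values))
    return profile


def candidate_boundaries(ssm, half_window, threshold):
    """Indices where the local standard deviation peaks above `threshold`."""
    p = local_std_profile(ssm, half_window)
    return [i for i in range(1, len(p) - 1)
            if p[i] > threshold and p[i] > p[i - 1] and p[i] >= p[i + 1]]
```

On a toy input of four identical frames followed by four frames orthogonal to them, with `half_window=2`, this rule flags the first frame of the second group as the only candidate boundary. Real feature sequences are noisier, so the window size and threshold would need tuning, and the method in the paper may differ in all of these details.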

Keywords

Music structure; Music segmentation; Signal processing; Self-similarity matrix

Notes

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2013R1A1A2012627) and by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the C-ITRC (Convergence Information Technology Research Center) support program (NIPA-2013-H0301-13-3006) supervised by the NIPA (National IT Industry Promotion Agency).

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. School of Electrical Engineering, Korea University, Seoul, Korea
  2. Department of Computer Engineering, Mevlana University, Konya, Turkey
