Music Structure Analysis

Abstract

One attribute that distinguishes music from random sound sources is its hierarchical organization. At the lowest level are events such as individual notes, which are characterized by their timbre, pitch, and duration. Combining sound events yields larger structures such as motifs, phrases, and sections, and these in turn form larger constructs that determine the overall layout of the composition. This higher structural level is also referred to as the musical structure of the piece, which is specified in terms of musical parts and their mutual relations. For example, in popular music such parts can be the intro, the chorus, and the verse sections of a song; in classical music, they can be the exposition, the development, and the recapitulation of a movement.

Keywords

  • Dynamic Time Warping
  • Novelty Detection
  • Path Structure
  • Popular Music
  • Music Information Retrieval

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Author information

Corresponding author

Correspondence to Meinard Müller.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Müller, M. (2015). Music Structure Analysis. In: Fundamentals of Music Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-21945-5_4

  • DOI: https://doi.org/10.1007/978-3-319-21945-5_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21944-8

  • Online ISBN: 978-3-319-21945-5

  • eBook Packages: Computer Science (R0)