Music Structure Analysis from Acoustic Signals

Handbook of Signal Processing in Acoustics

Music is full of structure, including sections, sequences of distinct musical textures, and the repetition of phrases or entire sections. The analysis of music audio relies on feature vectors that convey information about musical texture or pitch content. Texture generally refers to the average spectral shape and its statistical fluctuation, often reflecting the set of sounding instruments, e.g., strings, vocals, or drums. Pitch content reflects melody and harmony and is often independent of texture. Structure can be found in several ways. For instance, segment boundaries can be detected by observing marked changes in locally averaged texture.
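The boundary-detection idea in the abstract can be illustrated with a minimal sketch. It is not the chapter's method: the function names (local_mean, texture_change, pick_boundaries), the Euclidean distance, the window length, and the threshold are all assumptions chosen for illustration, and the two-dimensional "texture" vectors are toy data standing in for real spectral features. The sketch averages the feature vectors in a short window before and after each frame and marks a boundary wherever the distance between the two averages peaks.

```python
import math


def local_mean(frames, start, end):
    """Average the feature vectors frames[start:end] component-wise."""
    window = frames[start:end]
    dim = len(window[0])
    return [sum(f[d] for f in window) / len(window) for d in range(dim)]


def texture_change(frames, half_window):
    """Distance between the mean texture just before and just after each frame.

    Frames too close to either end of the signal keep a score of 0.0.
    """
    scores = [0.0] * len(frames)
    for t in range(half_window, len(frames) - half_window + 1):
        before = local_mean(frames, t - half_window, t)
        after = local_mean(frames, t, t + half_window)
        # Euclidean distance is an assumed choice; other measures would also work.
        scores[t] = math.sqrt(sum((a - b) ** 2 for a, b in zip(before, after)))
    return scores


def pick_boundaries(scores, threshold):
    """Return indices that are local maxima of scores and exceed threshold."""
    boundaries = []
    for t in range(1, len(scores) - 1):
        is_peak = scores[t] >= scores[t - 1] and scores[t] > scores[t + 1]
        if is_peak and scores[t] > threshold:
            boundaries.append(t)
    return boundaries


# Toy input: 20 frames of one texture followed by 20 frames of another.
frames = [[1.0, 0.0]] * 20 + [[0.0, 1.0]] * 20
scores = texture_change(frames, half_window=4)
print(pick_boundaries(scores, threshold=0.5))  # [20]
```

On real audio the frames would be spectral feature vectors computed from short analysis windows, and both the window length and the threshold would need tuning for the material.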




Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Dannenberg, R.B., Goto, M. (2008). Music Structure Analysis from Acoustic Signals. In: Havelock, D., Kuwano, S., Vorländer, M. (eds) Handbook of Signal Processing in Acoustics. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30441-0_21
