Impact of Frame Size and Instrumentation on Chroma-Based Automatic Chord Recognition

  • Daniel Stoller
  • Matthias Mauch
  • Igor Vatolkin
  • Claus Weihs
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


This paper presents a comparative study of classification performance in automatic audio chord recognition based on three chroma feature implementations, with the aim of distinguishing the effects of frame size, instrumentation, and choice of chroma feature. Until recently, research in automatic chord recognition has focused on the development of complete systems. While results have improved remarkably, the sources of recognition error remain poorly understood. To isolate these sources, we create a corpus of artificial instrument mixtures and investigate (a) the influence of different chroma frame sizes and (b) the impact of instrumentation and pitch height. We show that recognition performance is significantly affected not only by the method used, but also by the nature of the audio input. We compare these results to those obtained from a corpus of more than 200 real-world pop songs from The Beatles and other artists for the case in which chord boundaries are known in advance.
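To make the setting concrete, the following is a minimal sketch of one common way to classify a chord from a chroma vector when the chord boundaries are already known: average the chroma frames within the segment and pick the chord template with the highest cosine similarity. It is an illustration under stated assumptions, not the method evaluated in the paper. The binary major/minor triad templates and the function names (`triad_templates`, `segment_chroma`, `classify_chroma`) are hypothetical choices made for this example.

```python
import numpy as np

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def triad_templates():
    """Build binary 12-bin templates for the 24 major and minor triads.

    Assumption for illustration: each chord is modelled by the three
    pitch classes of its triad, all with equal weight.
    """
    templates = {}
    for root in range(12):
        for suffix, third in (("", 4), ("m", 3)):
            template = np.zeros(12)
            template[[root, (root + third) % 12, (root + 7) % 12]] = 1.0
            templates[PITCH_CLASSES[root] + suffix] = template
    return templates


def segment_chroma(frames):
    """Average frame-wise chroma vectors over one segment with known boundaries."""
    return np.mean(np.asarray(frames, dtype=float), axis=0)


def classify_chroma(chroma, templates=None):
    """Return the chord label whose template is most cosine-similar to `chroma`.

    Returns None for an all-zero (silent) chroma vector.
    """
    if templates is None:
        templates = triad_templates()
    chroma = np.asarray(chroma, dtype=float)
    norm = np.linalg.norm(chroma)
    if norm == 0.0:
        return None
    best_label, best_similarity = None, -1.0
    for label, template in templates.items():
        similarity = float(chroma @ template) / (norm * np.linalg.norm(template))
        if similarity > best_similarity:
            best_label, best_similarity = label, similarity
    return best_label


if __name__ == "__main__":
    # Two hypothetical chroma frames with most energy on C, E and G.
    frames = [
        [0.9, 0.0, 0.1, 0.0, 0.8, 0.0, 0.0, 0.7, 0.0, 0.1, 0.0, 0.0],
        [1.0, 0.0, 0.0, 0.0, 0.7, 0.1, 0.0, 0.9, 0.0, 0.0, 0.0, 0.1],
    ]
    print(classify_chroma(segment_chroma(frames)))
```

In a sketch like this, the frame size determines how many frames fall into each segment and how much each frame smears neighbouring chords, while instrumentation and pitch height change how much overtone energy leaks into pitch classes outside the triad. Both effects alter the similarity scores that such a classifier compares.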


Keywords (machine-generated, not supplied by the authors): Frame Size, Cosine Similarity, Music Piece, Pitch Height, Chroma Feature



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Daniel Stoller (1)
  • Matthias Mauch (2)
  • Igor Vatolkin (1)
  • Claus Weihs (3)
  1. TU Dortmund, Chair of Algorithm Engineering, Dortmund, Germany
  2. Queen Mary University of London, Centre for Digital Music, London, UK
  3. TU Dortmund, Chair of Computational Statistics, Dortmund, Germany
