
Machine Learning, Volume 65, Issue 2–3, pp. 473–484

Aggregate features and ADABOOST for music classification

  • James Bergstra
  • Norman Casagrande
  • Dumitru Erhan
  • Douglas Eck
  • Balázs Kégl

Abstract

We present an algorithm that predicts musical genre and artist from an audio waveform. Our method uses the ensemble learner ADABOOST to select from a set of audio features that have been extracted from segmented audio and then aggregated. Our classifier proved to be the most effective method for genre classification at the recent MIREX 2005 international contests in music information extraction, and the second-best method for recognizing artists. This paper describes our method in detail, from feature extraction to song classification, and presents an evaluation of our method on three genre databases and two artist-recognition databases. Furthermore, we present evidence collected from a variety of popular features and classifiers that the technique of classifying features aggregated over segments of audio is better than classifying either entire songs or individual short-timescale features.
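The pipeline summarized above (extract short-timescale features, aggregate them over longer segments, then boost weak learners on the aggregated vectors and let segments vote for the song label) can be illustrated with a small sketch. The snippet below is not the authors' implementation: it substitutes NumPy toy data for real audio features (the paper uses descriptors such as MFCCs and other spectral statistics), uses a simple mean-and-variance summary as the aggregation step, and stands in scikit-learn's AdaBoostClassifier for the multiclass ADABOOST variant used in the paper. It is only meant to show the shape of the segment-aggregation-and-vote approach.

```python
# Sketch of segment-level feature aggregation + AdaBoost classification.
# Toy data and sklearn's AdaBoost are stand-ins, not the paper's setup.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def aggregate_segments(frame_features, frames_per_segment=100):
    """Summarize frame-level features (n_frames x n_dims) per segment by
    concatenating the segment mean and variance (one plausible aggregation)."""
    n_frames, _ = frame_features.shape
    segments = []
    for start in range(0, n_frames - frames_per_segment + 1, frames_per_segment):
        chunk = frame_features[start:start + frames_per_segment]
        segments.append(np.concatenate([chunk.mean(axis=0), chunk.var(axis=0)]))
    return np.array(segments)

# Toy stand-in for per-frame audio features of 30 songs from 3 "genres".
rng = np.random.default_rng(0)
songs = [rng.normal(loc=g, size=(1000, 20)) for g in (0.0, 1.0, 2.0) for _ in range(10)]
genres = [g for g in (0, 1, 2) for _ in range(10)]

# Build a training set of aggregated segment vectors, labeled by song genre.
X, y = [], []
for feats, genre in zip(songs, genres):
    segs = aggregate_segments(feats)
    X.append(segs)
    y.extend([genre] * len(segs))
X = np.vstack(X)

# Boost decision stumps (sklearn's default weak learner) over segment vectors.
clf = AdaBoostClassifier(n_estimators=200)
clf.fit(X, y)

# At test time, classify each segment and let the song take the majority vote.
test_segments = aggregate_segments(songs[0])
song_prediction = np.bincount(clf.predict(test_segments)).argmax()
print("predicted genre:", song_prediction)
```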

Keywords

Genre classification, Artist recognition, Audio feature aggregation, Multiclass ADABOOST, MIREX


Copyright information

© Springer Science + Business Media, LLC 2006

Authors and Affiliations

  • James Bergstra (1)
  • Norman Casagrande (1)
  • Dumitru Erhan (1)
  • Douglas Eck (1)
  • Balázs Kégl (1)

  1. Department of Computer Science, University of Montreal, Montreal, Canada
