Applying Multiple Kernel Learning to Automatic Genre Classification

  • Hanna Lukashevich
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


In this paper we demonstrate the advantages of multiple-kernel learning for music genre classification. Multiple-kernel learning makes it possible to tune the kernel settings adaptively and independently for each group of features. Our experiments show improved classification performance compared with a conventional support vector machine classifier.
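To illustrate the idea of giving each feature group its own kernel settings, the following is a minimal sketch and not the author's implementation. It assumes a radial basis function kernel per group and a fixed convex combination of the resulting kernels. The feature-group names, the `gammas`, the `weights`, and the toy vectors are all hypothetical; in multiple-kernel learning the weights would be learned together with the classifier rather than set by hand.

```python
import math


def rbf_kernel(a, b, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2) between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def combined_kernel(sample_a, sample_b, gammas, weights):
    """Weighted sum of per-group RBF kernels.

    sample_a, sample_b: dicts mapping a feature-group name to its vector.
    gammas: one RBF width per feature group (assumed values).
    weights: non-negative per-group weights summing to 1; fixed here for
             illustration, whereas an MKL solver would learn them.
    """
    return sum(
        weights[g] * rbf_kernel(sample_a[g], sample_b[g], gammas[g])
        for g in weights
    )


# Two toy "tracks", each described by two hypothetical feature groups.
track_1 = {"timbre": [0.1, 0.4, 0.3], "rhythm": [120.0, 0.8]}
track_2 = {"timbre": [0.2, 0.1, 0.5], "rhythm": [98.0, 0.6]}

gammas = {"timbre": 2.0, "rhythm": 0.001}  # assumed: one setting per group
weights = {"timbre": 0.7, "rhythm": 0.3}   # assumed: would be learned by MKL

k12 = combined_kernel(track_1, track_2, gammas, weights)
k11 = combined_kernel(track_1, track_1, gammas, weights)
```

Because the groups differ greatly in scale (the rhythm vector here contains a tempo-like value), a single shared kernel width would suit at most one of them, which is the motivation for per-group settings. A kernel matrix built this way could be passed to any SVM implementation that accepts precomputed kernels.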


Support Vector Machine, Radial Basis Function Kernel, Audio Feature, Multiple Kernel Learn, Genre Classification
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work has been partly supported by the German research project GlobalMusic2One, funded by the Federal Ministry of Education and Research (BMBF-FKZ: 01/S08039B). Additionally, the Thuringian Ministry of Economy, Employment and Technology supported this research by granting funds of the European Fund for Regional Development to the project Songs2See, enabling transnational cooperation between Thuringian companies and their partners from other European regions.



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Fraunhofer Institute for Digital Media Technology, Ilmenau, Germany
