Discrete wavelet packet transform and ensembles of lazy and eager learners for music genre classification

Grimaldi, Marco; Cunningham, Pádraig; Kokaram, Anil

doi:10.1007/s00530-006-0027-z

Regular Paper
Published: 20 April 2006

Volume 11, pages 422–437, (2006)

Multimedia Systems

Marco Grimaldi¹,
Pádraig Cunningham² &
Anil Kokaram³

Abstract

This paper presents a process for determining the music genre of an item using a new set of descriptors. A discrete wavelet packet transform is applied to obtain the signal representation at two different resolutions: a frequency resolution and a time resolution tuned to encode music notes and their onset and offset. These features are tested on a number of data sets as descriptors for music genre classification. Lazy learning classifiers (k-nearest neighbor) and eager learners (neural networks and support vector machines) are applied in order to assess the classification power of the proposed features. Different feature selection techniques and ensemble methods are explored to maximize the accuracy of the classifiers and stabilize their behavior. Our evaluation shows that these frequency descriptors perform better than a standard approach based on Mel-Frequency Cepstral Coefficients and on the Short Time Fourier Transform in music genre classification. Moreover, our work confirms that a parameterization of the music rhythm based on the beat-histogram provides some meaningful information in the context of music classification by genre. Finally, our evaluation suggests that multi-class support vector machines with a linear kernel and round-robin binarization are the simplest and most effective process for music genre classification.
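
As a rough illustration of the wavelet packet idea mentioned in the abstract, the following Python sketch decomposes a signal into a full binary tree of sub-bands and computes one energy value per leaf. It is not the authors' feature extractor. The Haar filter, the tree depth, the energy features, and the function names `haar_step`, `wavelet_packet`, and `band_energies` are all assumptions made for illustration, since the abstract does not state the wavelet or the exact descriptors used.

```python
import math


def haar_step(x):
    """One Haar analysis step (assumed filter, not taken from the paper).

    Splits an even-length sequence into a low-pass half (scaled pairwise
    sums) and a high-pass half (scaled pairwise differences).
    """
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high


def wavelet_packet(x, depth):
    """Full wavelet packet tree of the given depth.

    Unlike the plain discrete wavelet transform, both the low-pass and the
    high-pass outputs are split again at every level. Returns the 2**depth
    leaf sub-bands in tree order, which is not strictly increasing in
    frequency.
    """
    if depth < 0 or len(x) == 0 or len(x) % (2 ** depth) != 0:
        raise ValueError("signal length must be a positive multiple of 2**depth")
    bands = [list(x)]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            low, high = haar_step(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands


def band_energies(bands):
    """Sum of squared coefficients per sub-band (a hypothetical descriptor)."""
    return [sum(c * c for c in band) for band in bands]


if __name__ == "__main__":
    # Toy input: eight samples, decomposed into four sub-bands of two samples.
    signal = [0.5, -1.0, 2.0, 3.0, -0.25, 0.0, 1.5, 4.0]
    print(band_energies(wavelet_packet(signal, 2)))
```

Because the Haar step is orthonormal, the sub-band energies add up to the energy of the input signal. Deeper trees give finer frequency resolution at the cost of coarser time resolution, which is the trade-off behind using two different resolutions.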

Author information

Authors and Affiliations

Computer Science Department, University College Dublin, Dublin, Ireland
Marco Grimaldi
Computer Science Department, Trinity College Dublin, Dublin, Ireland
Pádraig Cunningham
Electronic and Electrical Engineering Department, Trinity College Dublin, Dublin, Ireland
Anil Kokaram

Corresponding author

Correspondence to Marco Grimaldi.

About this article

Cite this article

Grimaldi, M., Cunningham, P. & Kokaram, A. Discrete wavelet packet transform and ensembles of lazy and eager learners for music genre classification. Multimedia Systems 11, 422–437 (2006). https://doi.org/10.1007/s00530-006-0027-z

Published: 20 April 2006
Issue Date: May 2006
DOI: https://doi.org/10.1007/s00530-006-0027-z
