Abstract
With the increasing popularity and availability of online music databases that store vast collections of music, automated classification of music genre has attracted significant attention as a tool for managing such large-scale databases. This paper presents a new music genre classification method based on gradient-based texture analysis of spectrograms constructed from the audio signals. We propose to use the gradient directional pattern (GDP), a robust local texture descriptor that exploits gradient directional information to encode the local texture properties of an image. The proposed method first computes spectrograms from the audio signals and then applies the GDP operator to construct feature descriptors that represent micro-level texture details of the spectrograms. A support vector machine (SVM) performs the classification. The effectiveness of the proposed method is evaluated on the GTZAN genre collection music database. Our experiments show promising results for the proposed GDP-based spectrogram texture analysis compared with several existing music genre classification methods.
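The GDP encoding step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the operator's details, so the sketch assumes one common formulation: gradient directions are estimated with 3x3 Sobel masks, and each of the eight neighbours of a pixel contributes a 1 bit when its gradient direction lies within a threshold `t` of the centre pixel's direction. The function names, the use of `atan2`, the bit ordering, and the default threshold of 22.5 degrees are all illustrative assumptions, not values taken from the paper. The input `img` stands in for a spectrogram stored as a 2D list of intensities.

```python
import math

# Clockwise 8-neighbourhood offsets (dy, dx); the bit order is an assumption.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def sobel_direction(img):
    """Gradient direction (radians) at each interior pixel, using 3x3 Sobel masks."""
    h, w = len(img), len(img[0])
    gd = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            # Assumption: atan2 is used so that gx == 0 needs no special case.
            gd[y][x] = math.atan2(gy, gx)
    return gd


def gdp_codes(img, t=math.radians(22.5)):
    """8-bit GDP-style code for every pixel whose neighbours all have a gradient."""
    gd = sobel_direction(img)
    h, w = len(img), len(img[0])
    codes = []
    for y in range(2, h - 2):
        row = []
        for x in range(2, w - 2):
            centre = gd[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(OFFSETS):
                # Bit is set when the neighbour's direction is within t of the centre's.
                # Simplification: the difference is not wrapped around +/- pi.
                if abs(gd[y + dy][x + dx] - centre) <= t:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def gdp_histogram(img, t=math.radians(22.5)):
    """256-bin histogram of GDP-style codes, usable as a texture feature vector."""
    hist = [0] * 256
    for row in gdp_codes(img, t):
        for code in row:
            hist[code] += 1
    return hist
```

In a full pipeline, a histogram like this (possibly computed per spectrogram region and concatenated) would be the feature vector passed to the SVM; the spectrogram computation and the classifier are omitted here to keep the sketch self-contained.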
Acknowledgments
The authors would like to thank NSERC Discovery Grant Project 1028463, NSERC Engage, AITF, and MITACS Accelerate for partial support of this project.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ahmed, F., Paul, P.P., Gavrilova, M. (2016). Music Genre Classification Using a Gradient-Based Local Texture Descriptor. In: Czarnowski, I., Caballero, A.M., Howlett, R.J., Jain, L.C. (eds) Intelligent Decision Technologies 2016. Smart Innovation, Systems and Technologies, vol 57. Springer, Cham. https://doi.org/10.1007/978-3-319-39627-9_40
DOI: https://doi.org/10.1007/978-3-319-39627-9_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39626-2
Online ISBN: 978-3-319-39627-9
eBook Packages: Engineering, Engineering (R0)