Abstract
In this paper, we propose a robust music genre classification method that combines a sparse-FFT-based feature extraction scheme, which exploits the discriminating power of spectral analysis of non-stationary audio signals, with the capability of sparse-representation-based classifiers. The feature extraction method combines two sets of features: short-term features, extracted from windowed signals, and long-term features, computed from combinations of the extracted short-term features. Experimental results demonstrate that the proposed feature extraction method yields a sparse representation of audio signals, achieving a significant reduction in signal dimensionality. The extracted features are then fed into a sparse representation based classifier (SRC). Experiments on the GTZAN database show that the proposed method outperforms other state-of-the-art SRC approaches, and its computational efficiency exceeds that of other Compressive Sampling (CS)-based classifiers.
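The two-stage feature pipeline described above (windowed short-term spectral features, then long-term aggregation over frames) can be sketched as follows. This is a minimal illustration, not the authors' exact procedure: the frame length, hop size, window type, and the number of retained FFT coefficients (`n_bins`) are illustrative assumptions, since the abstract does not specify them.

```python
import numpy as np

def short_term_features(signal, frame_len=512, hop=256, n_bins=32):
    """Short-term features: per-window FFT magnitudes, sparsified by
    keeping only the n_bins largest coefficients (illustrative choice)."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * np.hanning(frame_len)
        mag = np.abs(np.fft.rfft(frame))
        # Zero out all but the n_bins largest magnitudes -> sparse spectrum.
        small_idx = np.argsort(mag)[:-n_bins]
        sparse = mag.copy()
        sparse[small_idx] = 0.0
        frames.append(sparse)
    return np.array(frames)

def long_term_features(frames):
    """Long-term features: aggregate the short-term frames over time
    (here, per-bin mean and standard deviation) into one vector."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])

# Example on a synthetic signal (1 second at 22.05 kHz):
rng = np.random.default_rng(0)
sig = rng.standard_normal(22050)
st = short_term_features(sig)   # (num_frames, 257) sparse spectra
lt = long_term_features(st)     # single 514-dimensional descriptor
```

In an SRC setting, the resulting long-term vectors would form the columns of the training dictionary, and a test vector would be classified by the class whose atoms yield the smallest reconstruction residual under a sparse coding of the test vector.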
Cite this article
Banitalebi-Dehkordi, M., Banitalebi-Dehkordi, A. Music Genre Classification Using Spectral Analysis and Sparse Representation of the Signals. J Sign Process Syst 74, 273–280 (2014). https://doi.org/10.1007/s11265-013-0797-4