Abstract
Genre classification is a vital task today because the number of songs produced keeps increasing. On average, around 60,000 tracks are uploaded to Spotify per day, so classifying these tracks by genre is an important task for every music streaming service and platform. Owing to their high classification performance, neural network models such as the convolutional neural network (CNN), multi-layer perceptron (MLP), and long short-term memory network (LSTM) are used in this work to automatically classify music into genres based on Mel-frequency cepstrum coefficients (MFCCs), instead of entering the genre manually. We evaluated the models on the GTZAN dataset and provide a comparative analysis of the classification efficiency of the deep learning models. Our proposed CNN model achieved a classification accuracy of 70.42%, which is greater than human accuracy and higher than that of the other deep learning models.
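To make the MFCC feature concrete, the following is a minimal pure-Python sketch of how cepstral coefficients can be computed for a single audio frame: power spectrum, triangular mel filterbank, log energies, then a DCT-II. It is not the authors' pipeline. The Hamming window, the 2595·log10(1 + f/700) mel convention, the 20 filters, the 13 coefficients, and all function names are illustrative assumptions; in practice an audio library would normally handle framing and the FFT.

```python
import math


def hz_to_mel(hz):
    # One common mel-scale convention (assumed here, not taken from the paper).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..N/2) of one Hamming-windowed frame."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(max_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * hz / sample_rate)) for hz in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising slope
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs of a single frame: log mel-band energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for empty bands.
    log_energies = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energies))
            for c in range(n_coeffs)]


# Usage: a synthetic 440 Hz tone sampled at 8 kHz stands in for real audio.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc(frame, sample_rate)
```

Stacking such coefficient vectors over consecutive frames gives a two-dimensional (time by coefficient) array, which is the kind of input a CNN, MLP, or LSTM classifier could consume.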
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Preetham, M., Panga, J.B., Andrew, J., Raimond, K., Dang, H. (2022). Classification of Music Genres Based on Mel-Frequency Cepstrum Coefficients Using Deep Learning Models. In: Peter, J.D., Fernandes, S.L., Alavi, A.H. (eds) Disruptive Technologies for Big Data and Cloud Applications. Lecture Notes in Electrical Engineering, vol 905. Springer, Singapore. https://doi.org/10.1007/978-981-19-2177-3_83
DOI: https://doi.org/10.1007/978-981-19-2177-3_83
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-2176-6
Online ISBN: 978-981-19-2177-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)