Classification of Music Genres Based on Mel-Frequency Cepstrum Coefficients Using Deep Learning Models

  • Conference paper
Disruptive Technologies for Big Data and Cloud Applications

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 905)

Abstract

Genre classification is a vital task today because the number of songs produced keeps increasing. On average, around 60,000 tracks are uploaded to Spotify per day, so classifying these tracks by genre is important for every music streaming service and platform. Because neural network models such as the convolutional neural network (CNN), multi-layer perceptron (MLP), and long short-term memory network (LSTM) achieve high classification performance, they are used in this work to classify music into genres automatically from Mel-frequency cepstrum coefficients (MFCCs), instead of the genre being entered manually. We evaluated the models on the GTZAN dataset and provide a comparative analysis of the classification efficiency of the deep learning models. Our proposed CNN model achieved a classification accuracy of 70.42%, which is higher than human accuracy and than that of the other deep learning models.
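The abstract does not spell out how the MFCC features are computed, so the following is only a minimal sketch of the generic textbook pipeline (framing, windowing, power spectrum, mel filterbank, log, DCT), written with the Python standard library. The frame length, hop size, number of filters, number of coefficients, and the synthetic 440 Hz test tone are all illustrative assumptions and are not the settings used in the paper.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Convert a mel value back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def dct2(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc(signal, sample_rate, frame_len=256, hop=128,
         n_filters=20, n_coeffs=13):
    """Return one list of MFCCs per frame (all defaults are assumptions)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        # Small constant avoids log(0) for empty filters.
        log_energies = [math.log(e + 1e-10) for e in energies]
        features.append(dct2(log_energies, n_coeffs))
    return features


# A synthetic 440 Hz tone stands in for a music clip (assumption).
sr = 8000
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(1024)]
coeffs = mfcc(tone, sr)
print(len(coeffs), len(coeffs[0]))  # frames, coefficients per frame
```

The resulting frames-by-coefficients matrix is the kind of input a CNN, MLP, or LSTM classifier would consume. In practice an audio library with an FFT-based implementation would replace the naive DFT used here for brevity.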

Author information

Corresponding author

Correspondence to J. Andrew.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Preetham, M., Panga, J.B., Andrew, J., Raimond, K., Dang, H. (2022). Classification of Music Genres Based on Mel-Frequency Cepstrum Coefficients Using Deep Learning Models. In: Peter, J.D., Fernandes, S.L., Alavi, A.H. (eds) Disruptive Technologies for Big Data and Cloud Applications. Lecture Notes in Electrical Engineering, vol 905. Springer, Singapore. https://doi.org/10.1007/978-981-19-2177-3_83
