Abstract
Manually classifying millions of songs by genre is impractical for human annotators, which motivates an automated model that can classify song genres accurately. In this paper, a deep learning-based hybrid model is proposed for the analysis and classification of music genre files. The proposed hybrid model combines multimodal and transfer learning-based models for classification. The model is analyzed using the GTZAN and Ballroom datasets. The GTZAN dataset contains 1000 music files across 10 genres: Metal, Classical, Rock, Reggae, Pop, Disco, Blues, Country, Hip-Hop and Jazz. The Ballroom dataset contains 698 music files across 8 genres: Tango, ChaChaCha, Rumba, Viennese waltz, Jive, Waltz, Quickstep and Samba. Each music file in both datasets is 30 s long. The performance of the model is evaluated in Python, and the macro-average and weighted average are used to compute the accuracy percentages of each model. The results show that the proposed hybrid model performs better than other deep learning models, including the convolutional neural network model, the transfer learning-based model and the multimodal model, as well as machine learning models and other existing models, in terms of training accuracy, validation accuracy, training loss, validation loss, precision, recall, F1-score and support.
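The macro-average and weighted average mentioned above can be illustrated with a minimal sketch. It assumes the usual definitions: the macro-average is the unweighted mean of a per-class metric, and the weighted average weights each class by its support (the number of true samples in that class). The abstract does not state which library or code the authors used, so the function names `per_class_report` and `macro_and_weighted` and the toy labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    report = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1, support[label])
    return report


def macro_and_weighted(report):
    """Average precision, recall and F1 across classes in two ways."""
    rows = list(report.values())
    total = sum(row[3] for row in rows)
    # Macro: every class counts equally, regardless of size.
    macro = tuple(sum(row[i] for row in rows) / len(rows) for i in range(3))
    # Weighted: each class counts in proportion to its support.
    weighted = tuple(sum(row[i] * row[3] for row in rows) / total for i in range(3))
    return macro, weighted


# Toy labels for illustration only (hypothetical, not the paper's data).
y_true = ["jazz", "jazz", "jazz", "rock", "rock", "pop"]
y_pred = ["jazz", "jazz", "rock", "rock", "rock", "jazz"]

report = per_class_report(y_true, y_pred)
macro, weighted = macro_and_weighted(report)
print("macro    (P, R, F1):", macro)
print("weighted (P, R, F1):", weighted)
```

In single-label classification the weighted-average recall equals overall accuracy, while the macro-average is pulled down by small classes that are classified poorly. Reporting both helps when genre classes are unevenly sized, as in the Ballroom dataset.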
Data availability
Data are available on request.
Acknowledgements
We thank the Department of CSE, PMEC Berhampur (Government), India, for providing adequate infrastructure and facilities to conduct this research work. We also thank Sonalisha Mohapatra for her overall coordination with group members Anup Pradhan, Subhankar Dash, and Subham Kumar Biswal of the Department of CSE, PMEC Berhampur, in support of completing this research work under the guidance of Dr. Kalyan Kumar Jena and Dr. Sourav Kumar Bhoi.
Funding
There is no funding information.
Author information
Contributions
KKJ, SKB and SM contributed equally to this work. SB contributed to the design of the hybrid model and the simulations.
Ethics declarations
Conflict of interest
There is no conflict of interest.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent to publication
Not applicable.
About this article
Cite this article
Jena, K.K., Bhoi, S.K., Mohapatra, S. et al. A hybrid deep learning approach for classification of music genres using wavelet and spectrogram analysis. Neural Comput & Applic 35, 11223–11248 (2023). https://doi.org/10.1007/s00521-023-08294-6
DOI: https://doi.org/10.1007/s00521-023-08294-6