Music Detection Using Deep Learning with Tensorflow

Chikkamath, Satish; Nirmala, S. R.

doi:10.1007/978-981-16-3690-5_25

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 783)

Abstract

Music is a form of expression built from collections of harmonic frequencies and carried by sound. Groups of these frequencies combine into elements that produce either musical or non-musical expression. The main objective of this work is to detect the presence of music in a given audio file using transfer learning. The literature shows that music detection in an audio file can be done by extracting handcrafted audio features (such as ZCR, entropy, AMR, and LSTER) and training classifiers such as SVM or random forest on them. Advances in machine learning and deep learning architectures have opened a new path for music detection. An end-to-end classification system performs feature extraction and classification jointly; this may lead it to extract new, previously unknown features and so improve the overall accuracy of the system. However, training CNNs from scratch requires a huge dataset and is time consuming, hence the need for transfer learning. We used the TensorFlow VGGish model released by Google, which is trained on AudioSet data drawn from YouTube videos, as a feature extractor, and then trained an LSTM (long short-term memory) network, a special kind of RNN, for classification.
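
As an illustration of the handcrafted features that the abstract contrasts with learned ones, the sketch below computes the zero-crossing rate (ZCR) per frame of a sampled signal. It is not the authors' implementation: the function names `zero_crossing_rate` and `frame_zcr`, the frame and hop sizes, and the convention of treating a zero sample as non-negative are all assumptions made for this example.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Assumption: a sample equal to zero is treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1
        for prev, cur in zip(frame, frame[1:])
        if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(samples, frame_len, hop):
    """ZCR of every full frame of `frame_len` samples, stepping by `hop`."""
    return [
        zero_crossing_rate(samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def sine(freq_hz, sample_rate, n_samples):
    """Hypothetical test signal: a pure tone sampled at `sample_rate` Hz."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


if __name__ == "__main__":
    low = sine(200, 16000, 400)    # low-pitched tone
    high = sine(2000, 16000, 400)  # high-pitched tone
    # A higher-frequency tone changes sign more often, so its ZCR is larger.
    print(zero_crossing_rate(low), zero_crossing_rate(high))
```

A feature vector of this kind (ZCR together with the other handcrafted features) would then be passed to a conventional classifier such as an SVM, whereas the transfer-learning approach described in the abstract replaces the handcrafted step with embeddings from a pretrained network.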

Author information

Authors and Affiliations

KLE Technological University, Hubballi, Karnataka, India
Satish Chikkamath & S. R. Nirmala

Corresponding author

Correspondence to Satish Chikkamath.

Editor information

Editors and Affiliations

BioAxis DNA Research Centre Private Ltd., Hyderabad, Telangana, India
Amit Kumar
Department of Computer Engineering, Electrical Engineering and Applied Mathematics, University of Salerno, Fisciano, Salerno, Italy
Sabrina Senatore
Department of Computer Science and Engineering, CMR Institute of Technology, Hyderabad, Telangana, India
Vinit Kumar Gunjan

About this paper

Cite this paper

Chikkamath, S., Nirmala, S.R. (2022). Music Detection Using Deep Learning with Tensorflow. In: Kumar, A., Senatore, S., Gunjan, V.K. (eds) ICDSMLA 2020. Lecture Notes in Electrical Engineering, vol 783. Springer, Singapore. https://doi.org/10.1007/978-981-16-3690-5_25

DOI: https://doi.org/10.1007/978-981-16-3690-5_25
Published: 09 November 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-3689-9
Online ISBN: 978-981-16-3690-5
eBook Packages: Intelligent Technologies and Robotics (R0)
