
Combination of K-Means Clustering and Support Vector Machine for Instrument Detection

  • Original Research
  • Published:
SN Computer Science

Abstract

K-Means clustering and the support vector machine (SVM) are two fundamentally different approaches to classification. The purpose of the work discussed in this paper is to detect which musical instrument is being played, using K-Means clustering and SVM separately at various levels of clustering and classification. The research begins by detecting onsets in the audio signal to locate the instants at which the instrument(s) are played, and then segregating these instants by the instrument played. Mel-frequency cepstral coefficients (MFCCs) are then extracted and classified with K-Means and SVM to identify the instrument. Finally, the results obtained by SVM and by K-Means clustering individually are compared to determine which yields the more accurate result. It is concluded that the difference in results stems from the fundamental difference in how SVM and K-Means model and identify the instances.
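The two-path comparison described in the abstract can be sketched as follows, assuming scikit-learn is available. Synthetic Gaussian vectors stand in for the MFCC features of onset frames (in practice these would come from an MFCC extractor; the feature values and class separation here are illustrative assumptions, not data from the paper):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic stand-in for 13-dimensional MFCC vectors of onset frames
# from two instruments (one well-separated cluster per instrument).
n_per_class = 100
feats_a = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 13))
feats_b = rng.normal(loc=4.0, scale=1.0, size=(n_per_class, 13))
X = np.vstack([feats_a, feats_b])
y = np.array([0] * n_per_class + [1] * n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Supervised path: SVM trained on labelled onset features.
svm = SVC(kernel="rbf").fit(X_tr, y_tr)
svm_acc = accuracy_score(y_te, svm.predict(X_te))

# Unsupervised path: K-Means groups the same features into two clusters.
# Cluster indices are arbitrary, so score against both label assignments.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_tr)
pred = km.predict(X_te)
km_acc = max(accuracy_score(y_te, pred), accuracy_score(y_te, 1 - pred))

print(f"SVM accuracy: {svm_acc:.2f}, K-Means accuracy: {km_acc:.2f}")
```

The contrast the paper draws falls out of this structure: the SVM fits a decision boundary from labelled examples, while K-Means only groups feature vectors by proximity, so the two can disagree on the same MFCC inputs.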



Author information


Corresponding author

Correspondence to Shweta B. Thomas.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Pandey, A., Nair, T.R. & Thomas, S.B. Combination of K-Means Clustering and Support Vector Machine for Instrument Detection. SN COMPUT. SCI. 3, 121 (2022). https://doi.org/10.1007/s42979-021-01011-x
