Information Rate for Fast Time-Domain Instrument Classification
In this paper, we propose a novel feature set for instrument classification which is based on the information rate of the signal in the time domain. The feature is extracted by calculating the Shannon entropy over a sliding short-time energy frame and binning statistical features into a unique feature vector. Experimental results are presented, including a comparison to frequency-domain feature sets. The proposed entropy features are shown to be faster than popular frequency-domain methods while maintaining comparable accuracy in an instrument classification task.
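The extraction pipeline described above (short-time energy, Shannon entropy over a sliding window, summary statistics collected into a vector) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not give frame sizes, the entropy estimator, or the statistics used, so the function names, the parameters `frame_len`, `hop`, `window`, and `n_bins`, the histogram-based entropy estimate, and the choice of mean, standard deviation, minimum, and maximum are all assumptions.

```python
import numpy as np


def short_time_energy(x, frame_len=256, hop=128):
    """Energy (sum of squares) of each overlapping frame of x."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    if n_frames < 1:
        raise ValueError("signal is shorter than one frame")
    frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])
    return np.sum(frames ** 2, axis=1)


def shannon_entropy(values, n_bins=16):
    """Histogram-based Shannon entropy estimate, in bits (assumed estimator)."""
    counts, _ = np.histogram(values, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def entropy_features(x, frame_len=256, hop=128, window=32, n_bins=16):
    """Slide a window over the energy sequence, compute the entropy of each
    window, and summarise the entropy trajectory with four statistics."""
    energy = short_time_energy(x, frame_len, hop)
    if len(energy) < window:
        raise ValueError("not enough frames for one entropy window")
    h = np.array([
        shannon_entropy(energy[i:i + window], n_bins)
        for i in range(len(energy) - window + 1)
    ])
    # Hypothetical choice of statistics; the paper's own set is not stated here.
    return np.array([h.mean(), h.std(), h.min(), h.max()])
```

Everything in this sketch operates on the raw samples, with no Fourier transform involved, which is the property the abstract credits for the speed advantage. With a histogram of `n_bins` bins, each entropy value lies between 0 and `log2(n_bins)` bits.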
Keywords: Audio classification, Audio features, Audio signal processing, Time-domain methods
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.