TSD 2003: Text, Speech and Dialogue, pp. 275-280
Entropy and Dynamism Criteria for Speech and Audio Classification Applications
Abstract
We describe an audio classification system that uses entropy and dynamism criteria as discrimination features. The main idea of this approach is that the input neural network is treated as an information channel: a channel tuned to a certain type of information transmits that type best, as measured by an information criterion. In our case, a multilayer perceptron (MLP) that emits posterior probabilities for speech recognition serves as the information channel. Two features, entropy and dynamism, are then computed from these posterior probabilities, and an HMM is used as the classifier. Various experiments demonstrate that the entropy and dynamism criteria can be used effectively not only in speech/music discrimination tasks but also in other audio classification applications.
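The abstract does not give formulas for the two features. As a minimal sketch, the code below assumes definitions commonly used in posterior-based speech/music discrimination: entropy is the Shannon entropy of each frame's posterior vector, and dynamism is the squared change in posteriors between consecutive frames, with both averaged over a window. The function names, the window handling, and the three-class toy data are hypothetical and are not taken from the paper.

```python
import math


def frame_entropy(posteriors):
    """Shannon entropy (in nats) of one frame's posterior distribution."""
    # Zero-probability classes contribute nothing, so they are skipped.
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)


def frame_dynamism(current, following):
    """Squared change in posteriors between two consecutive frames."""
    return sum((a - b) ** 2 for a, b in zip(current, following))


def entropy_and_dynamism(frames):
    """Average both measures over a window of posterior vectors.

    frames: list of per-frame posterior vectors, each summing to 1.
    Returns (mean_entropy, mean_dynamism).
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to measure dynamism")
    mean_entropy = sum(frame_entropy(f) for f in frames) / len(frames)
    mean_dynamism = sum(
        frame_dynamism(a, b) for a, b in zip(frames, frames[1:])
    ) / (len(frames) - 1)
    return mean_entropy, mean_dynamism


# Toy posteriors over three hypothetical classes (assumed data).
# "confident": the network is sure of the class and the class keeps changing.
confident = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# "uncertain": the network cannot decide and its output never changes.
uncertain = [[1 / 3, 1 / 3, 1 / 3]] * 3

if __name__ == "__main__":
    print(entropy_and_dynamism(confident))
    print(entropy_and_dynamism(uncertain))
```

Under these assumed definitions, the confident sequence yields low entropy and high dynamism, while the uncertain one yields high entropy and zero dynamism. This contrast is the kind of separation a downstream classifier such as an HMM could exploit.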
Keywords
Posterior Probability, Hidden Markov Model, Gaussian Mixture Model, Sigmoid Activation Function, Continuous Speech Recognition