TSD 2003: Text, Speech and Dialogue pp 275-280

Entropy and Dynamism Criteria for Speech and Audio Classification Applications

  • Igor E. Kheidorov
  • Hanna M. Lukashevich
  • Denis L. Mitrofanov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2807)

Abstract

We describe an audio classification system that uses entropy and dynamism criteria as discriminating features. The main idea of this approach is that the input neural network is treated as an information channel: a channel tuned to a certain type of information transmits that type best according to an information-theoretic criterion. In our case, a multilayer perceptron (MLP) that emits posterior probabilities for speech recognition was used as this information channel. Two features, entropy and dynamism, were then computed from these posterior probabilities, and finally a hidden Markov model (HMM) was used as the classifier. Experiments showed that the entropy and dynamism criteria can be used effectively not only in speech/music discrimination tasks but also in other audio classification applications.
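The abstract does not state how entropy and dynamism are computed. As a rough sketch only, assuming the definitions commonly used in the speech/music discrimination literature (per-frame Shannon entropy of the posterior vector, and the squared change in posteriors between consecutive frames, each averaged over a short window), the two features could be derived from MLP outputs as follows. The function names, the window half-width, and the edge-padding choice are illustrative assumptions, not details taken from the paper.

```python
import math


def frame_entropy(posteriors):
    """Shannon entropy (in nats) of one frame's class posteriors."""
    # Terms with p == 0 contribute nothing, so they are skipped.
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)


def frame_dynamism(current, following):
    """Sum of squared posterior changes between two consecutive frames."""
    return sum((a - b) ** 2 for a, b in zip(current, following))


def windowed_mean(values, half_width):
    """Average each value with its neighbours within +/- half_width frames."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def entropy_dynamism_features(posteriorgram, half_width=2):
    """Return one (entropy, dynamism) pair per frame.

    posteriorgram: list of frames, each a list of posterior probabilities
    (assumed to sum to 1) as an MLP acoustic model might emit them.
    half_width: hypothetical smoothing parameter, not from the paper.
    """
    entropies = [frame_entropy(frame) for frame in posteriorgram]
    dynamisms = [
        frame_dynamism(a, b) for a, b in zip(posteriorgram, posteriorgram[1:])
    ]
    # The last frame has no successor; repeat the previous value (assumption).
    dynamisms.append(dynamisms[-1] if dynamisms else 0.0)
    return list(
        zip(windowed_mean(entropies, half_width),
            windowed_mean(dynamisms, half_width))
    )
```

Under these assumptions, confident and stable posteriors yield low entropy and low dynamism, while flat posteriors yield high entropy. The resulting two-dimensional feature sequence is what a classifier such as the HMM mentioned above would consume.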

Keywords

Posterior Probability · Hidden Markov Model · Gaussian Mixture Model · Sigmoid Activation Function · Continuous Speech Recognition

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Igor E. Kheidorov (1)
  • Hanna M. Lukashevich (1)
  • Denis L. Mitrofanov (1)
  1. Department of Radiophysics, Byelorussian State University, Minsk, Belarus
