Silence/Speech Detection Method Based on Set of Decision Graphs

  • Jan Trmal
  • Jan Zelinka
  • Jan Vaněk
  • Luděk Müller
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4188)


In this paper we demonstrate a complex supervised learning method based on binary decision graphs. The method is used to construct a silence/speech detector. The performance of the resulting detector is compared with that of common silence/speech detectors used in telecommunications and with a detector based on an HMM and a bigram silence/speech language model. Each non-leaf node of a decision graph is assigned a question and a sub-classifier that answers it. We test three kinds of sub-classifier: a linear classifier, a classifier based on a separating quadratic hyper-plane (SQHP), and a classifier based on Support Vector Machines (SVM). In addition to a single decision graph, we investigate the use of a set of binary decision graphs.
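The structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two features (frame energy and zero-crossing rate), all weights and thresholds, and the majority vote used to combine the set of graphs are assumptions made for illustration, since the abstract does not say how the graphs are trained or combined. Only the linear kind of sub-classifier is shown.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union

Features = Sequence[float]


@dataclass
class Leaf:
    label: str  # "speech" or "silence"


@dataclass
class Node:
    # Sub-classifier that answers this node's binary question.
    answer: Callable[[Features], bool]
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def linear_question(weights: Sequence[float], bias: float) -> Callable[[Features], bool]:
    """Linear sub-classifier: answers True when w.x + b > 0."""
    def answer(x: Features) -> bool:
        return sum(w * xi for w, xi in zip(weights, x)) + bias > 0.0
    return answer


def classify(root: Union[Node, Leaf], x: Features) -> str:
    """Follow the answers from the root until a leaf is reached."""
    node = root
    while isinstance(node, Node):
        node = node.if_true if node.answer(x) else node.if_false
    return node.label


def classify_with_set(graphs: Sequence[Union[Node, Leaf]], x: Features) -> str:
    """Combine several graphs by majority vote (an assumed combination rule)."""
    votes = [classify(g, x) for g in graphs]
    return "speech" if votes.count("speech") * 2 > len(votes) else "silence"


# Hypothetical feature vector: [frame energy, zero-crossing rate].
speech, silence = Leaf("speech"), Leaf("silence")

# A node shared by two parents, which makes the structure a graph, not a tree.
low_zcr = Node(linear_question([0.0, -1.0], 0.5), speech, silence)

graph_a = Node(linear_question([1.0, 0.0], -0.3), low_zcr, silence)
graph_b = Node(linear_question([1.0, 0.0], -0.6), speech, low_zcr)
graph_c = Node(linear_question([1.0, -1.0], 0.0), speech, silence)
graphs = [graph_a, graph_b, graph_c]
```

Calling `classify_with_set(graphs, [0.8, 0.2])` walks each of the three graphs for a loud frame with a low zero-crossing rate and returns the label most of them agree on. Swapping `linear_question` for a quadratic or SVM decision function would change only the `answer` callable, not the traversal.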


Speech Signal, Gaussian Mixture Model, Quadratic Programming Problem, Speech Recognition System, Voice Activity Detector
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jan Trmal (1)
  • Jan Zelinka (1)
  • Jan Vaněk (1)
  • Luděk Müller (1)
  1. Department of Cybernetics, University of West Bohemia, Plzeň, Czech Republic
