SVM-Enabled Voice Activity Detection

  • Javier Ramírez
  • Pablo Yélamos
  • Juan Manuel Górriz
  • Carlos G. Puntonet
  • José C. Segura
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3972)


Detecting the presence of speech in a noisy signal is an unsolved problem affecting numerous speech processing applications. This paper presents an effective method that employs support vector machines (SVM) for voice activity detection (VAD) in noisy environments. The use of kernels in the SVM makes it possible to map the data into another dot-product space (called the feature space) via a nonlinear transformation. The feature vector includes the subband signal-to-noise ratios of the input speech, and the SVM model uses a radial basis function (RBF) kernel. The proposed method is shown to learn how the acoustic noise masks the signal and to define an effective nonlinear decision rule. It shows clear improvements over standardized VADs for discontinuous speech transmission and distributed speech recognition, as well as over other recently reported VADs.
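
As a rough illustration of the decision rule described in the abstract, the following Python sketch computes subband SNRs in dB and evaluates an RBF-kernel SVM decision function on them. It is not the authors' implementation: the number of subbands, the helper names, the support vectors, the coefficients, the bias and the value of gamma are all hypothetical values chosen for illustration. In practice these parameters would come from training an SVM on labelled speech and non-speech frames.

```python
import math


def subband_snr_db(signal_energy, noise_energy, floor=1e-12):
    """Per-subband SNR in dB from subband energies (hypothetical helper)."""
    return [
        10.0 * math.log10(max(s, floor) / max(n, floor))
        for s, n in zip(signal_energy, noise_energy)
    ]


def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """SVM decision function f(x) = sum_i (alpha_i * y_i) * K(x_i, x) + b."""
    total = sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
    return total + bias


def is_speech(x, support_vectors, dual_coefs, bias, gamma):
    """Classify a frame as speech when the decision function is positive."""
    return svm_decision(x, support_vectors, dual_coefs, bias, gamma) > 0.0


# Assumed, hand-picked model parameters (NOT taken from the paper).
SUPPORT_VECTORS = [[15.0, 12.0, 10.0], [0.0, 1.0, -1.0]]  # SNRs in dB
DUAL_COEFS = [1.0, -1.0]  # alpha_i * y_i for each support vector
BIAS = 0.0
GAMMA = 0.01

# Three hypothetical subbands with a flat noise energy estimate.
noise_estimate = [1.0, 1.0, 1.0]
speech_frame = subband_snr_db([30.0, 20.0, 10.0], noise_estimate)
noise_frame = subband_snr_db([1.1, 0.9, 1.0], noise_estimate)

speech_flag = is_speech(speech_frame, SUPPORT_VECTORS, DUAL_COEFS, BIAS, GAMMA)
noise_flag = is_speech(noise_frame, SUPPORT_VECTORS, DUAL_COEFS, BIAS, GAMMA)
```

With these assumed parameters, the high-SNR frame lies close to the positive support vector and the low-SNR frame close to the negative one, so the sign of the decision function separates them. The nonlinearity comes entirely from the kernel: the rule is linear in the feature space but not in the subband SNRs themselves.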


Support Vector Machine, Radial Basis Function, False Alarm Rate, Support Vector Machine Model, Voice Activity Detector
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Javier Ramírez (1)
  • Pablo Yélamos (1)
  • Juan Manuel Górriz (1)
  • Carlos G. Puntonet (2)
  • José C. Segura (1)

  1. Dept. of Signal Theory, Networking and Communications, University of Granada, Spain
  2. Dept. of Architecture and Computer Technology, University of Granada, Spain
