A Robust VAD Method for Array Signals

  • Xiaohong Ma
  • Jin Liu
  • Fuliang Yin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3972)


This paper develops a new voice activity detection (VAD) method for microphone array signals. A relatively pure speech signal is first obtained by applying noise-cancelling algorithms to some of the signals from the microphone array. Rather than applying the same processing to correlated and uncorrelated noise, the proposed method first analyzes the nature of the background noise by calculating the correlation between the noisy signals during silence intervals. If the additive noises are correlated, the relatively pure speech component is separated by a blind source separation (BSS) method. Otherwise, this speech component is estimated by beamforming and a maximum a posteriori (MAP) algorithm. An entropy-based voice activity detection method is then used to determine whether this relatively pure speech signal is active. Finally, this VAD result is used as a reference to produce the VAD results for all array signals. Simulation results illustrate the validity of the proposed method.
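
The decision flow summarized in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the correlation measure, the entropy definition, or any thresholds, so the zero-lag normalized correlation, the spectral (Shannon) entropy, the function names, and all parameter values below are hypothetical choices. The BSS and beamforming/MAP stages are omitted; only the noise-type test and an entropy-based detector are shown.

```python
import numpy as np


def noise_is_correlated(x1, x2, threshold=0.5):
    """Return True if two noise-only (silence) segments are strongly correlated.

    Uses the magnitude of the zero-lag normalized cross-correlation.
    The threshold of 0.5 is an assumed value, not taken from the paper.
    """
    x1 = x1 - np.mean(x1)
    x2 = x2 - np.mean(x2)
    denom = np.sqrt(np.sum(x1 ** 2) * np.sum(x2 ** 2))
    if denom == 0.0:
        return False
    return bool(abs(np.sum(x1 * x2)) / denom > threshold)


def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy of the frame's normalized power spectrum."""
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    p = power / (np.sum(power) + eps)
    return float(-np.sum(p * np.log(p + eps)))


def entropy_vad(signal, frame_len=256, hop=128, noise_frames=10, margin=0.5):
    """Flag frames as active when their entropy drops below the noise level.

    The first `noise_frames` frames are assumed to contain noise only.
    A frame is marked active if its spectral entropy is more than `margin`
    below the mean entropy of those frames (assumed decision rule).
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    entropy = np.array([
        spectral_entropy(signal[i * hop:i * hop + frame_len])
        for i in range(n_frames)
    ])
    noise_level = np.mean(entropy[:noise_frames])
    return entropy < noise_level - margin
```

In this sketch, `noise_is_correlated` would select between the two enhancement branches, and `entropy_vad` would then run on the enhanced single-channel signal. Broadband noise spreads its power over many frequency bins and so has high spectral entropy, while a strongly harmonic or tonal frame concentrates power in a few bins and has low entropy, which is why a drop in entropy is used here as the activity cue.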


Speech Signal, Active Speech, Blind Source Separation, Noisy Signal, Channel Signal
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Xiaohong Ma (1)
  • Jin Liu (1)
  • Fuliang Yin (1)
  1. School of Electronic and Information Engineering, Dalian University of Technology, Dalian, China
