Abstract
This paper presents an improved Voice Activity Detection (VAD) algorithm based on the Signal-to-Noise Ratio (SNR) measure. We assume that the noise Power Spectral Density (PSD) in each spectral bin follows a Rayleigh distribution. With its asymmetric tail, the Rayleigh distribution describes the noise PSD distribution better than the Gaussian distribution does. Under this assumption, a new threshold updating expression is derived. Because the integral for the false alarm probability can be evaluated analytically, the threshold updating expression can be written without the inverse complementary error function, which gives the system a low computational complexity. Experimental results show that, under several noise environments, the proposed VAD outperforms or is at least comparable to the VAD scheme presented by Davis, and has lower computational complexity.
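The computational point in the abstract can be illustrated with a minimal sketch. It is not the paper's threshold updating expression, which is not reproduced in this text. It assumes the textbook Rayleigh parameterization with scale `sigma`, whose tail probability is P(X > eta) = exp(-eta^2 / (2 sigma^2)), so that a target false alarm probability can be inverted with a logarithm and a square root. The function names and the parameters `sigma`, `p_fa`, `mean`, and `std` are hypothetical and chosen only for illustration. The Gaussian counterpart is included for contrast, since it needs an inverse normal CDF (equivalently, an inverse complementary error function).

```python
import math
from statistics import NormalDist


def rayleigh_threshold(sigma, p_fa):
    """Threshold eta such that P(X > eta) = p_fa for X ~ Rayleigh(sigma).

    Assumes the standard Rayleigh tail exp(-eta^2 / (2 sigma^2)),
    which inverts in closed form.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if not 0.0 < p_fa < 1.0:
        raise ValueError("p_fa must lie in (0, 1)")
    return sigma * math.sqrt(-2.0 * math.log(p_fa))


def rayleigh_false_alarm(sigma, eta):
    """Tail probability P(X > eta) for X ~ Rayleigh(sigma), eta >= 0."""
    return math.exp(-(eta ** 2) / (2.0 * sigma ** 2))


def gaussian_threshold(mean, std, p_fa):
    """Gaussian counterpart: requires the inverse normal CDF."""
    return NormalDist(mean, std).inv_cdf(1.0 - p_fa)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    eta = rayleigh_threshold(sigma=1.0, p_fa=0.05)
    print("Rayleigh threshold:", eta)
    print("Recovered false alarm probability:", rayleigh_false_alarm(1.0, eta))
    print("Gaussian threshold:", gaussian_threshold(0.0, 1.0, 0.05))
```

Under these assumptions the Rayleigh threshold costs one logarithm and one square root per update, whereas the Gaussian threshold depends on a numerical inverse of the error function.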
References
ITU-T Recommendation G.729, Annex B, 1996.
F. Beritelli, S. Casale, and A. Cavallaro. A robust voice activity detector for wireless communications using soft computing. IEEE Journal on Selected Areas in Communications, 16(1998)9, 1818–1829.
S. Gazor and W. Zhang. A soft voice activity detector based on a Laplacian-Gaussian model. IEEE Transactions on Speech and Audio Processing, 11(2003)5, 498–505.
J. H. Chang, J. W. Shin, and N. S. Kim. Likelihood ratio test with complex Laplacian model for voice activity detection. Proceedings of Eurospeech, Geneva, Switzerland, 2003, 1065–1068.
J. Sohn, N. S. Kim, and W. Sung. A statistical model-based voice activity detection. IEEE Signal Processing Letters, 6(1999)1, 1–3.
A. Davis, S. Nordholm, and R. Togneri. Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold. IEEE Transactions on Audio, Speech, and Language Processing, 14(2006)2, 412–424.
C. Breithaupt and R. Martin. Voice activity detection in the DFT domain based on a parametric noise model. Proceedings of the International Workshop on Acoustic Echo and Noise Control (IWAENC), Paris, Sep. 2006.
A. Varga and H. J. M. Steeneken. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems. Speech Communication, 12(1993)3, 247–251.
J.-H. Chang, N. S. Kim, and S. K. Mitra. Voice activity detection based on multiple statistical models. IEEE Transactions on Signal Processing, 54(2006)6, 1965–1976.
Additional information
Supported by the National Natural Science Foundation of China (No. 60874060).
Communication author: Tan Hongzhou, born in 1965, male, Ph.D., Professor.
About this article
Cite this article
Li, Y., Chen, J. & Tan, H. Voice Activity Detection under Rayleigh distribution. J. Electron.(China) 26, 552–556 (2009). https://doi.org/10.1007/s11767-008-0133-5
DOI: https://doi.org/10.1007/s11767-008-0133-5
Key words
- Statistical Voice Activity Detection (VAD)
- Threshold update
- Rayleigh distribution
- Computational complexity