Abstract
In this work, a novel voice activity detection (VAD) algorithm that uses speech absence probability (SAP) based on Teager energy (TE) is proposed for speech enhancement. The proposed method employs local SAP (LSAP) based on the TE of noisy speech, rather than conventional LSAP, as a feature parameter for VAD in each frequency subband. Results show that the TE operator improves the discrimination between speech and noise and further suppresses noise components. Therefore, TE-based LSAP provides a better representation of LSAP, resulting in improved VAD for estimating noise power in a speech enhancement algorithm. In addition, the presented method uses TE-based global SAP (GSAP), derived in each frame, as a weighting parameter for modifying the adopted TE operator and improving its performance. The proposed algorithm was evaluated with objective and subjective quality tests under various environments and was shown to produce better results than the conventional method.
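As a rough illustration of the TE operator mentioned above, the following sketch assumes the standard discrete Teager energy form, psi[n] = x[n]^2 - x[n-1]*x[n+1], applied to a time-domain sequence. The abstract does not state the exact form used, and the function name `teager_energy` is hypothetical. The subband LSAP and GSAP computations of the proposed method are not reproduced here.

```python
import math


def teager_energy(x):
    """Discrete Teager energy of a sequence (assumed standard form).

    psi[n] = x[n]^2 - x[n-1] * x[n+1], computed for interior samples only,
    so the result is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input (an assumption, not data from the paper):
# a pure tone A*cos(w*n), for which the operator yields the constant
# A^2 * sin^2(w), i.e. it depends on both amplitude and frequency.
A, w = 0.5, 0.3
tone = [A * math.cos(w * n) for n in range(64)]
te = teager_energy(tone)
expected = A * A * math.sin(w) ** 2
```

Because the output for a tone scales with both amplitude and frequency, the operator responds differently to speech-like and noise-like components, which is the property the abstract relies on for discrimination.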
Additional information
Foundation item: Project supported by Inha University Research Grant; Project(10031764) supported by the Strategic Technology Development Program of Ministry of Knowledge Economy, Korea
About this article
Cite this article
Park, Ys., Lee, Sm. Speech enhancement through voice activity detection using speech absence probability based on Teager energy. J. Cent. South Univ. 20, 424–432 (2013). https://doi.org/10.1007/s11771-013-1503-1
DOI: https://doi.org/10.1007/s11771-013-1503-1