Speech enhancement through voice activity detection using speech absence probability based on Teager energy

Park, Yun-sik; Lee, Sang-min

doi:10.1007/s11771-013-1503-1

Published: 07 February 2013

Volume 20, pages 424–432 (2013)

Journal of Central South University

Yun-sik Park¹ & Sang-min Lee¹,²

Abstract

In this work, a novel voice activity detection (VAD) algorithm that uses speech absence probability (SAP) based on Teager energy (TE) is proposed for speech enhancement. The proposed method employs local SAP (LSAP) based on the TE of noisy speech, rather than conventional LSAP, as the feature parameter for VAD in each frequency subband. Results show that the TE operator enhances the ability to discriminate between speech and noise and further suppresses noise components. TE-based LSAP therefore provides a better representation of LSAP, resulting in improved VAD for estimating noise power in a speech enhancement algorithm. In addition, the presented method uses TE-based global SAP (GSAP), derived in each frame, as the weighting parameter for modifying the adopted TE operator and improving its performance. The proposed algorithm was evaluated with objective and subjective quality tests under various noise environments and was shown to produce better results than the conventional method.
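The two quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the preview does not show how the TE operator is applied to the subband signals or how GSAP weights it, so the code below uses only the textbook discrete Teager-Kaiser energy operator and the standard Gaussian-model likelihood ratio for a per-subband speech absence probability. The function names `teager_energy` and `local_sap`, and the parameter `q` (assumed ratio of prior probabilities P(H1)/P(H0)), are hypothetical and chosen for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns one value per interior sample, so the output is two samples
    shorter than the input (empty if fewer than three samples are given).
    """
    if len(x) < 3:
        return []
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def local_sap(gamma, xi, q=1.0):
    """Speech absence probability for one subband (textbook Gaussian model).

    gamma: a posteriori SNR, xi: a priori SNR,
    q: assumed ratio of priors P(H1) / P(H0) (illustrative default).
    """
    # Likelihood ratio of speech presence (H1) to speech absence (H0).
    likelihood_ratio = math.exp(gamma * xi / (1.0 + xi)) / (1.0 + xi)
    # Bayes' rule gives p(H0 | observation) = 1 / (1 + q * likelihood_ratio).
    return 1.0 / (1.0 + q * likelihood_ratio)


if __name__ == "__main__":
    # A pure tone A*cos(w*n) has constant Teager energy A^2 * sin(w)^2.
    amplitude, omega = 2.0, 0.3
    tone = [amplitude * math.cos(omega * n) for n in range(16)]
    print(teager_energy(tone)[:3])
    # Higher SNR should drive the speech absence probability toward zero.
    print(local_sap(gamma=1.0, xi=0.1), local_sap(gamma=20.0, xi=10.0))
```

For a sinusoid the operator returns the constant A² sin²(ω), which depends on both amplitude and frequency; this sensitivity is the usual motivation for using TE to separate speech from noise.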

Author information

Authors and Affiliations

Department of Electronic Engineering, Inha University, Incheon, 402-751, Korea
Yun-sik Park & Sang-min Lee
Institute for Information and Electronics Research, Inha University, Incheon, 402-751, Korea
Sang-min Lee

Corresponding author

Correspondence to Sang-min Lee.

Additional information

Foundation item: Project supported by Inha University Research Grant; Project (10031764) supported by the Strategic Technology Development Program of the Ministry of Knowledge Economy, Korea

About this article

Cite this article

Park, Ys., Lee, Sm. Speech enhancement through voice activity detection using speech absence probability based on Teager energy. J. Cent. South Univ. 20, 424–432 (2013). https://doi.org/10.1007/s11771-013-1503-1

Received: 03 November 2011
Accepted: 02 June 2012
Published: 07 February 2013
Issue Date: February 2013
DOI: https://doi.org/10.1007/s11771-013-1503-1
