Improved speech absence probability estimation based on environmental noise classification

Son, Young-ho; Lee, Sang-min

doi:10.1007/s11771-012-1309-6

Published: 08 September 2012

Volume 19, pages 2548–2553 (2012)

Journal of Central South University

Young-ho Son¹ & Sang-min Lee¹,²

Abstract

An improved speech absence probability estimation method using environmental noise classification is proposed for speech enhancement. A relevant noise estimation approach, known as the speech presence uncertainty tracking method, requires the a priori probability of speech absence, which is derived from the microphone input signal and the noise signal based on the estimated a posteriori signal-to-noise ratio (SNR). The conventional approach computes this probability with a fixed threshold and a fixed smoothing parameter. To overcome this limitation, first, the optimal parameter values in terms of perceived speech quality are derived for a variety of noise types. Second, these values are assigned according to the noise type determined by a real-time noise classification algorithm based on the Gaussian mixture model (GMM). The proposed algorithm thus estimates the speech absence probability with the optimal parameters of each classified noise type. The performance of the proposed method was evaluated by objective tests, namely the perceptual evaluation of speech quality (PESQ) and a composite measure, and by a subjective test, the mean opinion score (MOS), under various noise environments. The proposed method shows better results than existing methods.
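The two-stage idea in the abstract (classify the noise type with a GMM, then apply that type's threshold and smoothing parameter when tracking speech absence) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the function names, the diagonal-covariance GMM, the toy one-component models, the placeholder parameter values in `NOISE_PARAMS`, and the simple first-order recursion used for the a priori speech absence probability.

```python
import numpy as np

def gmm_log_likelihood(features, weights, means, variances):
    """Mean per-frame log-likelihood under a diagonal-covariance GMM.

    features: (T, D); weights: (K,); means and variances: (K, D).
    """
    diff = features[:, None, :] - means[None, :, :]                    # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, K)
    log_mix = log_comp + np.log(weights)                               # (T, K)
    # Log-sum-exp over mixture components for numerical stability.
    peak = log_mix.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_mix - peak).sum(axis=1))
    return float(frame_ll.mean())

def classify_noise(features, models):
    """Return the name of the noise model with the highest likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(features, *models[name]))

def update_speech_absence(q_prev, post_snr, threshold, smoothing):
    """Assumed recursion: smooth a hard 'speech absent' decision per bin."""
    absent = (post_snr < threshold).astype(float)
    return smoothing * q_prev + (1.0 - smoothing) * absent

# Hypothetical per-noise-type (threshold, smoothing) pairs; placeholders only.
NOISE_PARAMS = {"white": (0.8, 0.95), "babble": (1.2, 0.90)}

# Toy one-component models as (weights, means, variances); not trained on data.
MODELS = {
    "white": (np.array([1.0]), np.zeros((1, 2)), np.ones((1, 2))),
    "babble": (np.array([1.0]), np.full((1, 2), 3.0), np.ones((1, 2))),
}

features = np.full((5, 2), 3.0)            # stand-in for noise-only feature frames
noise_type = classify_noise(features, MODELS)
threshold, smoothing = NOISE_PARAMS[noise_type]
q = update_speech_absence(np.array([0.5, 0.5]), np.array([0.5, 2.0]),
                          threshold, smoothing)
```

In this sketch the classifier only selects which parameter pair is used; the enhancement gain that would consume `q` is left out, since the abstract does not specify it.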

Author information

Authors and Affiliations

Department of Electronic Engineering, Inha University, Incheon, 402-751, Korea
Young-ho Son & Sang-min Lee
Institute for Information and Electronics Research, Inha University, Incheon, 402-751, Korea
Sang-min Lee

Corresponding author

Correspondence to Sang-min Lee.

Additional information

Foundation item: Project supported by an Inha University Research Grant; Project (10031764) supported by the Strategic Technology Development Program of the Ministry of Knowledge Economy, Korea

About this article

Cite this article

Son, Yh., Lee, Sm. Improved speech absence probability estimation based on environmental noise classification. J. Cent. South Univ. 19, 2548–2553 (2012). https://doi.org/10.1007/s11771-012-1309-6

Received: 25 October 2011
Accepted: 29 February 2012
Published: 08 September 2012
Issue Date: September 2012
DOI: https://doi.org/10.1007/s11771-012-1309-6
