Analysis of Optimized Spectral Subtraction Method for Single Channel Speech Enhancement

Gupta, Monika; Singh, R. K.; Singh, Sachin

doi:10.1007/s11277-022-10039-y


Published: 10 September 2022

Volume 128, pages 2203–2215, (2023)
Wireless Personal Communications

Monika Gupta¹,
R. K. Singh² &
Sachin Singh³


Abstract

Speech is the primary medium of personal communication; however, ambient noise generally impairs speech signal quality and intelligibility. It is therefore necessary to improve the quality and comprehensibility of the distorted speech signal. In the field of speech processing, great efforts have been made to develop speech enhancement techniques that restore speech signals by reducing the amount of interfering noise. This work presents a critical analysis of a single channel speech enhancement technique that performs noise reduction through spectral subtraction based on minimum statistics. The minimum statistics approach estimates the power spectrum of a non-stationary noise signal by tracking the smallest value of a smoothed power spectrum of the noisy speech signal, which avoids the problem of detecting speech activity. The performance of the spectral subtraction method is evaluated over a wide range of noise types at varying noise levels using single channel speech data. The noise estimator is used to find the optimal value of the method parameter and to tune the algorithm so that it is more suitable for voice communication purposes. The system can be implemented in MATLAB and is validated against performance measures including signal-to-noise ratio improvement (SNRI) and spectral distortion (SD). The approach provides effective speech enhancement in terms of the SNRI and SD metrics. The proposed method, named Spectral Statistics Based on Minimum Statistics (SSBMS), tracks transient noise and provides a better response in the speech enhancement process.
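As an illustration of the technique the abstract describes, the following is a minimal Python/NumPy sketch of spectral subtraction driven by a simplified minimum-statistics noise estimate. It is not the authors' MATLAB implementation or the SSBMS method itself. The frame length, hop size, smoothing constant `alpha`, minimum-search window, `bias` factor, over-subtraction factor, and spectral floor are all assumed values chosen for illustration, and the gated-tone test signal is a synthetic stand-in for NOIZEUS speech.

```python
import numpy as np

FRAME, HOP = 256, 128  # 50% overlap; illustrative values, not from the paper

# Sine (square-root Hann) window: its square overlap-adds to 1 at 50% overlap.
WIN = np.sin(np.pi * np.arange(FRAME) / FRAME)

def stft(x):
    """Windowed short-time Fourier transform, shape (frames, bins)."""
    n_frames = 1 + (len(x) - FRAME) // HOP
    frames = np.stack([x[i * HOP:i * HOP + FRAME] * WIN for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec):
    """Overlap-add synthesis matching stft(); the outer half-frames are tapered."""
    frames = np.fft.irfft(spec, n=FRAME, axis=1) * WIN
    out = np.zeros((spec.shape[0] - 1) * HOP + FRAME)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + FRAME] += frame
    return out

def min_stats_noise(power, alpha=0.85, window=40, bias=1.5):
    """Simplified minimum statistics: smooth the noisy power spectrum
    recursively, then take the running minimum per frequency bin."""
    smoothed = np.empty_like(power)
    smoothed[0] = power[0]
    for t in range(1, len(power)):
        smoothed[t] = alpha * smoothed[t - 1] + (1 - alpha) * power[t]
    noise = np.empty_like(power)
    for t in range(len(power)):
        noise[t] = smoothed[max(0, t - window + 1):t + 1].min(axis=0)
    # The minimum underestimates the mean noise power, so scale it up
    # by a fixed (assumed) bias factor.
    return bias * noise

def spectral_subtraction(noisy, over_sub=2.0, floor=0.02):
    """Power spectral subtraction with a spectral floor; noisy phase is kept."""
    spec = stft(noisy)
    power = np.abs(spec) ** 2
    noise = min_stats_noise(power)
    clean_power = np.maximum(power - over_sub * noise, floor * power)
    gain = np.sqrt(clean_power / np.maximum(power, 1e-12))
    return istft(gain * spec)

def snr_db(clean, estimate):
    """SNR of an estimate relative to the clean reference, in dB."""
    n = min(len(clean), len(estimate))
    err = clean[:n] - estimate[:n]
    return 10 * np.log10(np.sum(clean[:n] ** 2) / np.sum(err ** 2))

def make_demo(seed=0, fs=8000, seconds=3.0, sigma=0.3):
    """Synthetic stand-in for speech: a gated 440 Hz tone in white noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * seconds)) / fs
    gate = ((t % 0.75) >= 0.25) & ((t % 0.75) < 0.5)  # bursts with pauses
    clean = gate * np.sin(2 * np.pi * 440 * t)
    return clean, clean + sigma * rng.standard_normal(len(t))

clean, noisy = make_demo()
enhanced = spectral_subtraction(noisy)
print("SNRI (dB):", snr_db(clean, enhanced) - snr_db(clean, noisy))
```

The pauses between tone bursts matter in this sketch: the running minimum can only fall back to the noise level if the search window spans a stretch without signal energy, which is the assumption that lets minimum statistics avoid an explicit speech activity detector. SNRI is computed here simply as output SNR minus input SNR; the spectral distortion (SD) measure is not sketched.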


Data Availability

The datasets generated and/or analysed during the current study are available in the NOIZEUS repository (https://ecs.utdallas.edu/loizou/speech/noizeus/) and on IEEE DataPort, a dataset storage and search platform (https://ieee-dataport.org/keywords/speech-dataset). For the experiments, the clean speech samples were taken from the NOIZEUS corpus, a publicly available speech database that is commonly used for benchmark experiments. The data are openly available in a public repository that issues datasets with DOIs and are derived from public domain resources.

Code Availability

Not applicable.


Funding

No funding was received for this manuscript.

Author information

Authors and Affiliations

Uttarakhand Technical University, Dehradun, India
Monika Gupta
Kumaon Engineering College, Dwarahat, India
R. K. Singh
NIT New Delhi, Delhi, India
Sachin Singh


Contributions

Ms. Monika Gupta prepared this manuscript under the guidance of Dr. R. K. Singh and Dr. Sachin Singh.

Corresponding author

Correspondence to Monika Gupta.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Gupta, M., Singh, R.K. & Singh, S. Analysis of Optimized Spectral Subtraction Method for Single Channel Speech Enhancement. Wireless Pers Commun 128, 2203–2215 (2023). https://doi.org/10.1007/s11277-022-10039-y


Accepted: 30 August 2022
Published: 10 September 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s11277-022-10039-y
