Abstract
A major drawback of the widely used spectral subtraction (SS) algorithm is that it fails to suppress musical noise. In addition, the subtraction rules in SS rest on the false assumption that the cross-terms are zero. A novel approach is proposed to overcome these shortcomings of the SS algorithm. The technique calculates exactly the cross-terms, which involve the phase differences between the degraded speech signal and the noise model. The gain function of the proposed technique has the same properties as those of traditional minimum mean square error (MMSE) algorithms. Experimental results on the NOIZEUS speech corpus show that the proposed algorithm outperforms traditional SS algorithms in speech quality and intelligibility at lower SNR conditions. Further, no musical noise is audible in the processed (enhanced) speech. The computational complexity, the input and output waveforms, and the corresponding spectrograms of the proposed and existing speech enhancement techniques are also presented in this work.
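To make the cross-term assumption concrete, the following is a minimal sketch for a single frequency bin. It is not the authors' implementation. It only illustrates the standard identity |Y|^2 = |X|^2 + |D|^2 + 2|X||D|cos(θ_X − θ_D) for a noisy observation Y = X + D, and shows that basic power subtraction is off by exactly the cross-term when that term is ignored. The bin values and the function names `cross_term` and `power_subtraction` are hypothetical and chosen for illustration. The clean spectrum X is known here only because the example is synthetic.

```python
import cmath
import math


def cross_term(x: complex, d: complex) -> float:
    """Return 2|X||D|cos(theta_X - theta_D) for one frequency bin."""
    return 2.0 * abs(x) * abs(d) * math.cos(cmath.phase(x) - cmath.phase(d))


def power_subtraction(y: complex, d: complex) -> float:
    """Basic power spectral subtraction estimate of |X|^2, floored at zero."""
    return max(abs(y) ** 2 - abs(d) ** 2, 0.0)


# Hypothetical single-bin values (assumptions, not taken from the paper).
x = cmath.rect(1.0, 0.3)  # clean speech: magnitude 1.0, phase 0.3 rad
d = cmath.rect(0.8, 1.1)  # noise: magnitude 0.8, phase 1.1 rad
y = x + d                 # noisy observation in the same bin

exact = abs(y) ** 2                   # true noisy power
no_cross = abs(x) ** 2 + abs(d) ** 2  # power if the cross-term were zero
c = cross_term(x, d)                  # the term that basic SS neglects
estimate = power_subtraction(y, d)    # basic SS estimate of |X|^2

print(f"|Y|^2 = {exact:.4f}, |X|^2 + |D|^2 = {no_cross:.4f}, cross-term = {c:.4f}")
print(f"SS estimate of |X|^2 = {estimate:.4f}, true |X|^2 = {abs(x) ** 2:.4f}")
```

In this sketch the estimate differs from the true clean power by the cross-term, which vanishes only when the speech and noise phases differ by 90 degrees in that bin.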
Ethics declarations
Manuscript has no associated data.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Nagaraja B G and Jayanna H S contributed equally to this work.
About this article
Cite this article
G, T.Y., G, N.B. & S, J.H. A spatial procedure to spectral subtraction for speech enhancement. Multimed Tools Appl 81, 23633–23647 (2022). https://doi.org/10.1007/s11042-022-12152-3
DOI: https://doi.org/10.1007/s11042-022-12152-3