Abstract
The presence of noise in speech signals can significantly degrade the performance of speech recognition systems. A threshold denoising method based on mathematical morphology is proposed to reduce background white noise. In the method we consider speech spectrograms as images and construct binary images from a normalized 256-level gray scale spectrogram image. We take advantage of a sudden slowing in the average value (ratio of the number of ‘1’ pixels to the total pixel number) of the binary image, and use it as the threshold value to zero spectrogram elements below the threshold, normalize the spectrogram, and finally, reconstruct the original speech signal to achieve the goal of speech enhancement. The main advantage of the algorithm is fast speed that is highly desired in real-time speech processing.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Ajmera, P.K., Jadhav, D.V., Holambe, R.S.: Text-independent speaker identification using Radon and discrete cosine transforms based features from speech spectrogram. Pattern Recogn. 44, 2749–2759 (2011)
Alsteris, L.D., Paliwal, K.K.: Iterative reconstruction of speech from short-time Fourier transform phase and magnitude spectra. Comput. Speech Lang. 21, 174–186 (2007)
Berouti, M., Schwartz, R., Makhoul, J.: Enhancement of speech corrupted by acoustic noise. In: IEEE, pp. 208–211 (1979)
Boll, S.F.: Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust. Speech Signal Process. 27, 113–120 (1979)
Cohen, L.: Time-frequency distributions - a review. Proc. IEEE 77, 941–981 (1989)
Dennis, J., Tran, H.D., Li, H.: Spectrogram image feature for sound event classification in mismatched conditions. IEEE Signal Process. Lett. 18, 130–133 (2011)
Mallawaarachchi, A., Ong, S.H., Chitre, M., Taylor, E.: Spectrogram denoising and automated extraction of the fundamental frequency variation of dolphin whistles. J. Acoust. Soc. Am. 124, 1159–1170 (2008)
Pinkowski, B.: Principal component analysis of speech spectrogram images. Pattern Recogn. 30, 777–787 (1997)
Soille, P.: Morphological image analysis: principles and applications. Springer Science & Business Media, Heidelberg (2013)
Steinberg, R., Shaughnessy, D.O.: Segmentation of a speech spectrogram using mathematical morphology. In: IEEE, pp. 1637–1640 (2008)
Xu, H., Tan, Z.-H., Dalsgaard, P., Lindberg, B.: Robust speech recognition by nonlocal means denoising processing. IEEE Signal Process. Lett. 15, 701–704 (2008)
Acknowledgements
This work was supported by the Natural Science Foundation of China (No. 61471111)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, G., Zheng, C., Xu, T., Cao, X., Xingpeng, M., Wang, S. (2019). A Threshold Denoising Algorithm Based on Mathematical Morphology for Speech Enhancement. In: Liang, Q., Mu, J., Jia, M., Wang, W., Feng, X., Zhang, B. (eds) Communications, Signal Processing, and Systems. CSPS 2017. Lecture Notes in Electrical Engineering, vol 463. Springer, Singapore. https://doi.org/10.1007/978-981-10-6571-2_215
Download citation
DOI: https://doi.org/10.1007/978-981-10-6571-2_215
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6570-5
Online ISBN: 978-981-10-6571-2
eBook Packages: EngineeringEngineering (R0)