
A Threshold Denoising Algorithm Based on Mathematical Morphology for Speech Enhancement

  • Guangyan Li
  • Caixia Zheng
  • Tingfa Xu
  • Xiaolin Cao
  • Mao Xingpeng
  • Shuangwei Wang
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 463)

Abstract

The presence of noise in speech signals can significantly degrade the performance of speech recognition systems. We propose a threshold denoising method based on mathematical morphology to reduce background white noise. The method treats the speech spectrogram as an image and constructs binary images from a normalized 256-level grayscale spectrogram. We exploit a sudden slowing in the change of the average value of the binary image (the ratio of the number of ‘1’ pixels to the total number of pixels) and use the point at which it occurs as the threshold: spectrogram elements below the threshold are set to zero, the spectrogram is normalized, and finally the speech signal is reconstructed to achieve speech enhancement. The main advantage of the algorithm is its speed, which is highly desirable in real-time speech processing.
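The thresholding step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the gray levels are computed or how the "sudden slowing" is detected, so the log compression in `to_gray` and the largest-deceleration rule in `pick_threshold` are guesses, and all function names are hypothetical. The morphological operations, the final normalization, and the signal reconstruction are omitted because the abstract gives no detail on them.

```python
import numpy as np

def to_gray(mag):
    """Map a magnitude spectrogram to 256 gray levels (0..255).
    Assumption: log compression before min-max normalization."""
    log_mag = np.log1p(mag)
    span = log_mag.max() - log_mag.min()
    norm = (log_mag - log_mag.min()) / (span if span > 0 else 1.0)
    return np.round(norm * 255).astype(np.uint8)

def ones_ratio_curve(gray):
    """ratio[t] = share of '1' pixels in the binary image (gray >= t),
    for every candidate threshold t = 0..255."""
    hist = np.bincount(gray.ravel(), minlength=256)
    counts = np.cumsum(hist[::-1])[::-1]  # number of pixels with gray >= t
    return counts / gray.size

def pick_threshold(ratio):
    """Assumed reading of 'sudden slowing': the level just after the
    point where the decrease of the ratio decelerates the most."""
    drop = -np.diff(ratio)       # drop[t] = ratio[t] - ratio[t + 1]
    slowing = drop[:-1] - drop[1:]
    return int(np.argmax(slowing)) + 1

def apply_threshold(mag):
    """Zero the spectrogram elements whose gray level is below the threshold."""
    gray = to_gray(mag)
    t = pick_threshold(ones_ratio_curve(gray))
    return np.where(gray >= t, mag, 0.0), t
```

Because the ratio curve is obtained from a single 256-bin histogram rather than from 256 separately built binary images, the cost is linear in the number of spectrogram elements, which is in keeping with the speed claim in the abstract. On real spectrograms the histogram would probably need smoothing before the deceleration is measured.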

Notes

Acknowledgements

This work was supported by the Natural Science Foundation of China (No. 61471111).


Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Guangyan Li (1)
  • Caixia Zheng (2)
  • Tingfa Xu (3)
  • Xiaolin Cao (4)
  • Mao Xingpeng (5)
  • Shuangwei Wang (1)

  1. School of Physics, Northeast Normal University, Changchun, China
  2. School of Computer Science and Information Technology, Northeast Normal University, Changchun, China
  3. Photoelectric Imaging and Information Engineering Institute, Beijing Institute of Technology, Beijing, China
  4. College of Automotive Engineering, Jilin University, Changchun, China
  5. School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China
