A Threshold Denoising Algorithm Based on Mathematical Morphology for Speech Enhancement

  • Guangyan Li
  • Caixia Zheng (Email author)
  • Tingfa Xu
  • Xiaolin Cao
  • Mao Xingpeng
  • Shuangwei Wang
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 463)


Noise in speech signals can significantly degrade the performance of speech recognition systems. We propose a threshold denoising method based on mathematical morphology to reduce background white noise. The method treats speech spectrograms as images and constructs binary images from a normalized 256-level grayscale spectrogram image. It exploits a sudden slowing in the change of the average value of the binary image (the ratio of the number of ‘1’ pixels to the total number of pixels) and uses the point where this occurs as the threshold. Spectrogram elements below the threshold are set to zero, the spectrogram is normalized, and the speech signal is then reconstructed from it to achieve speech enhancement. The main advantage of the algorithm is its speed, which is highly desirable in real-time speech processing.
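The thresholding step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the "sudden slowing" is detected, so the rule in `pick_threshold` (first gray level after the steepest drop where the drop falls below a fraction `slow_factor` of it) is a hypothetical choice, as are all function names. The morphological operations, the short-time Fourier transform, and the signal reconstruction are not specified in the abstract and are left out. Plain Python lists keep the sketch dependency-free.

```python
def to_gray(mag):
    """Normalize a magnitude spectrogram (list of rows) to gray levels 0..255."""
    lo = min(min(row) for row in mag)
    hi = max(max(row) for row in mag)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant image
    return [[int(round(255 * (v - lo) / span)) for v in row] for row in mag]


def ones_ratio_curve(gray):
    """ratios[t] = fraction of pixels that are '1' when binarizing at level t
    (a pixel is '1' if its gray level is >= t), for t = 0..255."""
    total = sum(len(row) for row in gray)
    hist = [0] * 256
    for row in gray:
        for g in row:
            hist[g] += 1
    ratios, remaining = [], total
    for t in range(256):
        ratios.append(remaining / total)
        remaining -= hist[t]
    return ratios


def pick_threshold(ratios, slow_factor=0.1):
    """ASSUMED rule (not from the paper): find the steepest drop in the ratio
    curve, then return the first later level where the drop has slowed to
    less than slow_factor times that steepest drop."""
    drops = [ratios[t] - ratios[t + 1] for t in range(255)]
    peak = max(range(255), key=lambda t: drops[t])
    for t in range(peak + 1, 255):
        if drops[t] < slow_factor * drops[peak]:
            return t
    return peak + 1


def threshold_denoise(mag):
    """Zero elements whose gray level is below the chosen threshold, then
    normalize the result to a maximum of 1. Returns (spectrogram, threshold)."""
    gray = to_gray(mag)
    t = pick_threshold(ones_ratio_curve(gray))
    cleaned = [[v if g >= t else 0.0 for v, g in zip(row, grow)]
               for row, grow in zip(mag, gray)]
    peak = max(max(row) for row in cleaned) or 1.0
    return [[v / peak for v in row] for row in cleaned], t
```

Because every step is a single pass over the pixels plus a scan of 256 gray levels, the cost is linear in the spectrogram size, which is in keeping with the speed claim in the abstract.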



This work was supported by the Natural Science Foundation of China (No. 61471111).



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Guangyan Li (1)
  • Caixia Zheng (2, Email author)
  • Tingfa Xu (3)
  • Xiaolin Cao (4)
  • Mao Xingpeng (5)
  • Shuangwei Wang (1)

  1. School of Physics, Northeast Normal University, Changchun, China
  2. School of Computer Science and Information Technology, Northeast Normal University, Changchun, China
  3. Photoelectric Imaging and Information Engineering Institute, Beijing Institute of Technology, Beijing, China
  4. College of Automotive Engineering, Jilin University, Changchun, China
  5. School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China
