Noise Suppression Method Based on Modulation Spectrum Analysis

  • Takuto Isoyama
  • Masashi Unoki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11096)


Conventional noise suppression methods can successfully reduce stationary noise. However, they cannot sufficiently suppress non-stationary noise, such as intermittent and impulsive noise, because they do not focus on the temporal features of the noise. This paper proposes a method for suppressing both stationary and non-stationary noise based on modulation spectrum analysis. The modulation spectra (MS) of stationary, intermittent, and impulsive noise were investigated using time/frequency/modulation analysis techniques to characterize their MS features. These features were then used to suppress the stationary and non-stationary noise components in the observed signals. The proposed method removes the direct-current components of the MS for stationary noise, the harmonicity of the MS for intermittent noise, and the higher modulation-frequency components of the MS for impulsive noise. The following advantages of the proposed method were confirmed: (1) the sound pressure level of the noise was dramatically reduced, (2) the signal-to-noise ratio of the noisy speech was improved, and (3) the loudness, sharpness, and roughness of the restored speech were enhanced. These results indicate that stationary as well as non-stationary noise can be successfully suppressed using the proposed method.
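
The three modulation-domain operations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the temporal envelope of a single sub-band is already available (for example, from one gammatone filterbank channel), it takes the modulation spectrum to be the discrete Fourier transform of that envelope, and it simply zeroes the affected bins. The function names, the 16 Hz cutoff, the repetition rate `f0_hz`, and the notch width are all hypothetical parameters chosen for illustration.

```python
import numpy as np


def modulation_spectrum(envelope, fs_env):
    """Return modulation frequencies (Hz) and the complex modulation spectrum.

    Assumption: the modulation spectrum is the DFT of a sub-band envelope
    sampled at fs_env Hz.
    """
    spectrum = np.fft.rfft(envelope)
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / fs_env)
    return freqs, spectrum


def suppress_in_modulation_domain(envelope, fs_env, cutoff_hz=16.0,
                                  f0_hz=None, notch_width_hz=0.5):
    """Zero modulation-spectrum bins associated with three noise types.

    All parameter values are illustrative assumptions, not values from
    the paper.
    """
    freqs, spec = modulation_spectrum(envelope, fs_env)
    spec = spec.copy()

    # Stationary noise: remove the direct-current (0 Hz) component.
    spec[0] = 0.0

    # Impulsive noise: remove the higher modulation-frequency components.
    spec[freqs > cutoff_hz] = 0.0

    # Intermittent noise: remove harmonics of an assumed repetition rate.
    if f0_hz is not None:
        harmonic_idx = np.round(freqs / f0_hz)
        is_harmonic = (harmonic_idx >= 1) & (
            np.abs(freqs - harmonic_idx * f0_hz) <= notch_width_hz
        )
        spec[is_harmonic] = 0.0

    # Back to the time domain: the processed envelope.
    return np.fft.irfft(spec, n=len(envelope))
```

Zeroing bins is the crudest possible choice and also discards any speech energy that falls in the same bins. A practical system would more likely attenuate or subtract an estimate of the noise contribution, and would then use the processed envelopes to resynthesize the signal across all sub-bands.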


Keywords: Noise suppression · Modulation spectrum · Non-stationary noise · Gammatone filterbank · Psychoacoustical sound-quality index



This work was supported by the Secom Science and Technology Foundation, by the Suzuki Foundation, and by a Grant-in-Aid for Innovative Areas (Nos. 16H01669 and 18H05004) from MEXT, Japan.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Japan Advanced Institute of Science and Technology, Nomi, Japan
