Blind Noise Reduction for Speech Enhancement by Simulated Auditory Nerve Representations

  • Anton Yakovenko
  • Aleksandr Antropov
  • Galina Malykhina
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11555)


Background and environmental noise degrades the quality of verbal communication between humans as well as in human-computer interaction. A healthy auditory system, however, solves this problem efficiently. Hence, knowledge of the physiology of auditory perception can be combined with noise reduction algorithms to enhance speech intelligibility. This paper proposes an approach to noise reduction at the level of the auditory periphery. The approach applies an adaptive neural-network algorithm for independent component analysis to perform blind source separation on simulated auditory nerve firing probability patterns. It has been evaluated on several categories of colored-noise models and on real-world acoustic scenes. The proposed technique significantly increases the signal-to-noise ratio of the auditory nerve representations of complex sounds while accommodating variable spatial positioning of sound sources and a flexible number of sensors.


Keywords: Speech enhancement · Noise reduction · Blind source separation · Independent component analysis · Machine hearing · Auditory periphery model · Auditory nerve responses · FastICA
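The blind source separation step described in the abstract can be illustrated, in rough outline, with a FastICA decomposition of synthetic multi-sensor mixtures. This is a minimal sketch using scikit-learn's `FastICA`; the two source signals and the mixing matrix below are hypothetical stand-ins for illustration, not the simulated auditory-nerve representations or the adaptive algorithm used in the paper.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n_samples = 2000
t = np.linspace(0, 8, n_samples)

# Hypothetical stand-ins for two acoustic sources:
# a periodic "speech-like" component and a noise component.
speech = np.sin(2 * np.pi * 5 * t)
noise = rng.laplace(size=n_samples)
S = np.c_[speech, noise]          # sources, shape (n_samples, 2)

# Mix the sources at two "sensors", simulating spatially
# separated observations of the same acoustic scene.
A = np.array([[1.0, 0.5],
              [0.7, 1.0]])        # assumed (unknown to the algorithm) mixing matrix
X = S @ A.T                       # observed mixtures, shape (n_samples, 2)

# Recover statistically independent components blindly,
# i.e. without any knowledge of A or of the sources.
ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)      # estimated sources, shape (n_samples, 2)
```

ICA recovers the sources only up to permutation and scaling, so in practice the separated component corresponding to speech must still be identified, for example by its spectral or statistical properties.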



The reported study was funded by the Russian Foundation for Basic Research according to the research project No 18-31-00304.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Anton Yakovenko (1)
  • Aleksandr Antropov (1)
  • Galina Malykhina (1, 2)

  1. Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia
  2. Russian State Scientific Center for Robotics and Technical Cybernetics, St. Petersburg, Russia
