A Modified NMF-Based Filter Bank Approach for Enhancement of Speech Data in Nonstationary Noise

Non-negative Matrix Factorization Techniques

Part of the book series: Signals and Communication Technology (SCT)

Abstract

This article addresses the problem of single-channel speech enhancement in the presence of nonstationary noise. A novel modified NMF-based filter bank approach is proposed. The method applies filter bank analysis to the noisy input signal and then extracts the speech signal with a modified NMF (MNMF), which learns a speaker-independent speech dictionary using a precomputed noise dictionary. The proposed method works well with real-world nonstationary noise, independently of the speaker. It is evaluated on a speech database comprising different speakers and shows promising enhancement performance, reducing the nonstationary noise while improving PESQ (perceptual evaluation of speech quality) scores compared with other competitive state-of-the-art methods.
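The idea of learning a speech dictionary while a precomputed noise dictionary is held fixed can be illustrated with a generic semi-supervised NMF. The sketch below is not the chapter's MNMF, whose details are not given in this abstract; it assumes a non-negative time-frequency matrix `V` (for example, subband magnitudes from a filter bank), the standard Lee-Seung multiplicative updates for the Euclidean cost, and a Wiener-like soft mask. The function names `semi_supervised_nmf` and `speech_mask` and all parameter names are hypothetical.

```python
import numpy as np

def semi_supervised_nmf(V, W_noise, n_speech=8, n_iter=200, eps=1e-9, seed=0):
    """Factor V ~= [W_noise, W_speech] @ H with W_noise held fixed.

    V       : (n_freq, n_frames) non-negative magnitude matrix (assumed input)
    W_noise : (n_freq, n_noise) precomputed noise dictionary, never updated
    Returns (W_speech, H_noise, H_speech).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_noise = W_noise.shape[1]
    # Random positive initialisation of the speech atoms and all activations
    W_speech = rng.random((n_freq, n_speech)) + eps
    H = rng.random((n_noise + n_speech, n_frames)) + eps
    for _ in range(n_iter):
        W = np.hstack([W_noise, W_speech])
        # Multiplicative update of all activations (Euclidean cost)
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the speech atoms; the noise atoms stay as given
        H_s = H[n_noise:]
        W_speech *= (V @ H_s.T) / (W @ H @ H_s.T + eps)
    return W_speech, H[:n_noise], H[n_noise:]

def speech_mask(W_noise, W_speech, H_noise, H_speech, eps=1e-9):
    """Wiener-like soft mask: speech estimate divided by total estimate."""
    S = W_speech @ H_speech
    N = W_noise @ H_noise
    return S / (S + N + eps)
```

Under these assumptions, multiplying the mask element-wise with the noisy magnitude matrix gives a speech magnitude estimate, which would then be passed to whatever synthesis stage matches the analysis filter bank.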

Notes

  1. The Noizeus speech corpus is publicly available online at the following URL: http://www.utdallas.edu/~loizou/speech/noizeus.

Corresponding author

Correspondence to Farook Sattar.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Sattar, F., Jin, F. (2016). A Modified NMF-Based Filter Bank Approach for Enhancement of Speech Data in Nonstationary Noise. In: Naik, G. (ed.) Non-negative Matrix Factorization Techniques. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48331-2_5

  • DOI: https://doi.org/10.1007/978-3-662-48331-2_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-48330-5

  • Online ISBN: 978-3-662-48331-2

  • eBook Packages: Engineering, Engineering (R0)
