A Modified NMF-Based Filter Bank Approach for Enhancement of Speech Data in Nonstationary Noise

Sattar, Farook; Jin, Feng

doi:10.1007/978-3-662-48331-2_5


Part of the book series: Signals and Communication Technology (SCT)


Abstract

This article addresses the problem of single-channel speech enhancement in the presence of nonstationary noise. A novel modified NMF-based filter bank approach is proposed for speech enhancement. The method consists of filter bank analysis of the noisy input signal, followed by extraction of the speech signal using a modified NMF (MNMF) that learns a speaker-independent speech dictionary with the help of a precomputed noise dictionary. The proposed method works well with real-world nonstationary noise, independent of the speaker. The method is evaluated on a speech database consisting of different speakers and shows promising enhancement performance, reducing the nonstationary noise while improving PESQ (perceptual evaluation of speech quality) scores compared with other competitive state-of-the-art methods.
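The general idea of learning a speech dictionary while a precomputed noise dictionary is held fixed can be illustrated with a minimal sketch. This is not the chapter's MNMF: it is a generic semi-supervised NMF using the standard Euclidean multiplicative updates, it skips the filter bank stage and assumes a nonnegative time-frequency matrix V is already available, and the function names (semi_supervised_nmf, extract_speech) and all parameter values are hypothetical.

```python
import numpy as np


def semi_supervised_nmf(V, W_noise, n_speech=8, n_iter=200, eps=1e-9, seed=0):
    """Approximate V by [W_speech, W_noise] @ H, keeping W_noise fixed.

    V       : nonnegative (n_freq, n_frames) matrix of the noisy signal
    W_noise : nonnegative (n_freq, n_noise) precomputed noise dictionary
    Returns the learned speech dictionary and the full activation matrix.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_noise = W_noise.shape[1]
    W_speech = rng.random((n_freq, n_speech)) + eps
    H = rng.random((n_speech + n_noise, n_frames)) + eps

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])
        # Update all activations (speech and noise) with the dictionary fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the speech atoms; the noise atoms are never modified.
        H_speech = H[:n_speech]
        W_speech *= (V @ H_speech.T) / (W @ H @ H_speech.T + eps)

    return W_speech, H


def extract_speech(V, W_speech, W_noise, H, eps=1e-9):
    """Estimate the speech part of V with a soft (Wiener-like) mask."""
    n_speech = W_speech.shape[1]
    speech_part = W_speech @ H[:n_speech]
    noise_part = W_noise @ H[n_speech:]
    return V * speech_part / (speech_part + noise_part + eps)
```

In a setup like this, W_noise would typically be obtained beforehand by factorizing noise-only recordings, and the masked output would be passed to a synthesis stage to recover a waveform; both steps are outside the scope of the sketch.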


Notes

1. The Noizeus speech corpus is publicly available online at the following URL: http://www.utdallas.edu/~loizou/speech/noizeus.


Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada
Farook Sattar
Department of Electrical and Computer Engineering, Ryerson University, Toronto, ON, Canada
Feng Jin


Corresponding author

Correspondence to Farook Sattar.

Editor information

Editors and Affiliations

University of Technology Sydney, Sydney, New South Wales, Australia
Ganesh R. Naik


About this chapter

Cite this chapter

Sattar, F., Jin, F. (2016). A Modified NMF-Based Filter Bank Approach for Enhancement of Speech Data in Nonstationary Noise. In: Naik, G. (eds) Non-negative Matrix Factorization Techniques. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48331-2_5


DOI: https://doi.org/10.1007/978-3-662-48331-2_5
Published: 26 September 2015
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-48330-5
Online ISBN: 978-3-662-48331-2
eBook Packages: Engineering, Engineering (R0)
