Abstract
This paper presents a speech denoising method based on Non-Negative Matrix Factorization (NMF). Relative to previous related work, it makes two contributions. First, the method does not assume a priori knowledge about the nature of the noise. Second, it combines the Kullback-Leibler divergence with sparseness constraints on the activation matrix, improving on the performance of similar techniques that minimize the Euclidean distance and/or do not consider any sparsification. We evaluate the proposed method on both speech enhancement and automatic speech recognition tasks and compare it to conventional spectral subtraction, showing improvements in speech quality and recognition accuracy, respectively, under different noisy conditions.
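To make the combination of Kullback-Leibler divergence and activation sparseness concrete, the following is a minimal sketch of the widely used multiplicative-update scheme for NMF with an L1 penalty on the activation matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names (`sparse_kl_nmf`, `kl_divergence`), the penalty weight `sparsity`, the L1 form of the sparseness constraint, and the column normalisation of the basis matrix are all choices made here for the example, since the abstract does not specify them.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-12):
    """Generalised Kullback-Leibler divergence D(V || WH), summed over all entries."""
    V = V + eps
    WH = WH + eps
    return float(np.sum(V * np.log(V / WH) - V + WH))


def sparse_kl_nmf(V, rank, sparsity=0.1, n_iter=200, eps=1e-12, seed=0):
    """Factorise a non-negative matrix V (freq x frames) as W @ H.

    W holds `rank` basis vectors (columns); H holds their activations.
    `sparsity` is a hypothetical weight for an L1 penalty on H.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    ones = np.ones_like(V)

    for _ in range(n_iter):
        # Activation update: the L1 penalty appears as an extra term
        # in the denominator, which shrinks the activations.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + sparsity)

        # Basis update: standard KL multiplicative rule (no penalty on W).
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (ones @ H.T + eps)

        # Assumed heuristic: keep each basis column at unit L1 norm and
        # move the scale into H, so the penalty cannot be bypassed by
        # inflating W while shrinking H. The product W @ H is unchanged.
        scale = W.sum(axis=0, keepdims=True) + eps
        W /= scale
        H *= scale.T

    return W, H
```

In a denoising setting, `V` would typically be a magnitude spectrogram of the noisy signal; how the factors are then separated into speech and noise parts is specific to the paper and is not reproduced here.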
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ludeña-Choez, J., Gallardo-Antolín, A. (2012). Speech Denoising Using Non-negative Matrix Factorization with Kullback-Leibler Divergence and Sparseness Constraints. In: Torre Toledano, D., et al. Advances in Speech and Language Technologies for Iberian Languages. Communications in Computer and Information Science, vol 328. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35292-8_22
DOI: https://doi.org/10.1007/978-3-642-35292-8_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35291-1
Online ISBN: 978-3-642-35292-8
eBook Packages: Computer Science (R0)