Hybrid Method for Speech Enhancement Using α-Divergence

Sunnydayal, V.; Sirisha Devi, J.; Nandyala, Siva Prasad

doi:10.1007/978-981-13-7082-3_48

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 74)

Abstract

A hybrid speech enhancement method that combines Non-Negative Matrix Factorization (NMF) with statistical modeling is proposed, using speech and noise bases that are updated online. In the presence of nonstationary noise, template-based approaches have shown better performance than statistical modeling, but they depend on a priori information. The hybrid method is developed to overcome the drawbacks of these approaches. Its performance is further improved by considering speech bases as well as noise bases. In terms of Source-to-Distortion Ratio (SDR) and Perceptual Evaluation of Speech Quality (PESQ), the proposed method outperforms traditional algorithms in nonstationary noise environments.
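
The abstract does not spell out the factorization itself, so the following is only a minimal sketch of the general technique named in the title: NMF fitted under the Amari α-divergence with the standard multiplicative updates, followed by a Wiener-style gain built from separate speech and noise bases. The function names (alpha_divergence, alpha_nmf, wiener_mask), the default α = 0.5, and the choice to keep the bases fixed during enhancement are assumptions made for illustration. The chapter's online basis update and its statistical-model stage are not reproduced here.

```python
import numpy as np

def alpha_divergence(V, Q, alpha):
    """Amari alpha-divergence D_alpha(V || Q), summed over all entries.

    Valid for alpha not in {0, 1} and strictly positive V and Q.
    """
    terms = V**alpha * Q**(1 - alpha) - alpha * V + (alpha - 1) * Q
    return np.sum(terms) / (alpha * (alpha - 1))

def alpha_nmf(V, rank, alpha=0.5, n_iter=200, W_fixed=None, seed=0, eps=1e-12):
    """Approximate V (freq x frames) by W @ H using multiplicative updates.

    If W_fixed is given, the bases are held constant and only the
    activations H are estimated (rank is then taken from W_fixed).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = W_fixed.copy() if W_fixed is not None else rng.random((n_freq, rank)) + eps
    H = rng.random((W.shape[1], n_frames)) + eps
    for _ in range(n_iter):
        # Activation update: H <- H * ((W^T (V/WH)^alpha) / (W^T 1))^(1/alpha)
        R = (V / (W @ H + eps)) ** alpha
        H *= ((W.T @ R) / (W.sum(axis=0)[:, None] + eps)) ** (1.0 / alpha)
        if W_fixed is None:
            # Basis update: W <- W * (((V/WH)^alpha H^T) / (1 H^T))^(1/alpha)
            R = (V / (W @ H + eps)) ** alpha
            W *= ((R @ H.T) / (H.sum(axis=1)[None, :] + eps)) ** (1.0 / alpha)
    return W, H

def wiener_mask(V, W_speech, W_noise, alpha=0.5, n_iter=100):
    """Wiener-style gain in [0, 1] from fixed speech and noise bases (assumed pre-trained)."""
    W = np.hstack([W_speech, W_noise])
    _, H = alpha_nmf(V, W.shape[1], alpha=alpha, n_iter=n_iter, W_fixed=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # estimated speech magnitude
    N = W_noise @ H[k:]    # estimated noise magnitude
    return S / (S + N + 1e-12)
```

In use, V would be the magnitude spectrogram of the noisy signal; multiplying V element-wise by the returned gain and inverting the transform with the noisy phase gives an enhanced waveform. Setting α close to 1 makes the updates approach the familiar Kullback-Leibler NMF rules.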

Author information

Authors and Affiliations

Vellore Institute of Technology AP, Amaravathi, Andhra Pradesh, India
V. Sunnydayal
Institute of Aeronautical Engineering, Hyderabad, India
J. Sirisha Devi
Model Based Design, ABU, Tata Elxsi Limited, Bangalore, India
Siva Prasad Nandyala

Corresponding author

Correspondence to V. Sunnydayal.

Editor information

Editors and Affiliations

Guru Nanak Institutions, Ibrahimpatnam, Telangana, India
H. S. Saini
Guru Nanak Institutions, Ibrahimpatnam, Telangana, India
Rishi Sayal
JNTUH College of Engineering Hyderabad, Hyderabad, Telangana, India
Aliseri Govardhan
CLOUDS Laboratory, The University of Melbourne, Melbourne, VIC, Australia
Rajkumar Buyya

About this paper

Cite this paper

Sunnydayal, V., Sirisha Devi, J., Nandyala, S.P. (2019). Hybrid Method for Speech Enhancement Using α-Divergence. In: Saini, H., Sayal, R., Govardhan, A., Buyya, R. (eds) Innovations in Computer Science and Engineering. Lecture Notes in Networks and Systems, vol 74. Springer, Singapore. https://doi.org/10.1007/978-981-13-7082-3_48

DOI: https://doi.org/10.1007/978-981-13-7082-3_48
Published: 19 June 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-7081-6
Online ISBN: 978-981-13-7082-3
eBook Packages: Engineering, Engineering (R0)
