Beamforming Initialization and Data Prewhitening in Natural Gradient Convolutive Blind Source Separation of Speech Mixtures

Gupta, Malay; Douglas, Scott C.

doi:10.1007/978-3-540-74494-8_58

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4666)

Included in the following conference series:

International Conference on Independent Component Analysis and Signal Separation

Abstract

Successful speech enhancement by convolutive blind source separation (BSS) techniques requires careful design of all aspects of the chosen separation method. The conventional strategy for system initialization in both time- and frequency-domain BSS involves a diagonal center-spike FIR filter matrix and no data preprocessing; however, this strategy may not be the best for any chosen separation algorithm. In this paper, we experimentally evaluate two different approaches for potentially improving the performance of time-domain and frequency-domain natural gradient speech separation algorithms – prewhitening of the signal mixtures, and delay-and-sum beamforming initialization for the separation system – to determine which of the two classes of algorithms benefits most from them. Our results indicate that frequency-domain-based natural gradient BSS methods generally need geometric information about the system to obtain any reasonable separation quality. For time-domain natural gradient separation algorithms, either beamforming initialization or prewhitening improves separation performance, particularly for larger-scale problems involving three or more sources and sensors.
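
To make the two initialization strategies named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a uniform linear microphone array, far-field sources, look angles measured from broadside, and steering delays rounded to whole samples. The function names, the parameters (`spacing_m`, `fs_hz`, `angles_deg`), the sign convention for arrival delays, and the example geometry are all hypothetical choices made for illustration.

```python
import math

def center_spike_init(n, taps):
    """Conventional start: W[i][j] is a length-`taps` FIR filter from
    sensor j to output i; only diagonal filters hold a unit center spike."""
    W = [[[0.0] * taps for _ in range(n)] for _ in range(n)]
    for i in range(n):
        W[i][i][taps // 2] = 1.0
    return W

def delay_and_sum_init(angles_deg, n_mics, taps, spacing_m, fs_hz, c=343.0):
    """Delay-and-sum start: output i steers a beam toward angles_deg[i].

    Assumptions (not from the paper): uniform linear array, far-field
    plane waves, angles measured from broadside, integer-sample delays.
    """
    center = taps // 2
    W = [[[0.0] * taps for _ in range(n_mics)] for _ in angles_deg]
    for i, angle in enumerate(angles_deg):
        s = math.sin(math.radians(angle))
        for j in range(n_mics):
            # Sensor position relative to the array center, in meters.
            pos = (j - (n_mics - 1) / 2.0) * spacing_m
            # Assumed convention: a source at a positive angle reaches
            # sensors at positive positions earlier (negative delay).
            arrival = -pos * s / c * fs_hz  # in samples
            # Place the spike so that arrival delay + filter delay is the
            # same (`center`) for every sensor, aligning the wavefront.
            tap = center - round(arrival)
            if not 0 <= tap < taps:
                raise ValueError("filter too short for this steering delay")
            W[i][j][tap] = 1.0 / n_mics
    return W

# Hypothetical two-sensor setup: 0.2 m spacing, 16 kHz sampling.
W0 = center_spike_init(2, 5)
W1 = delay_and_sum_init([-30.0, 30.0], n_mics=2, taps=9,
                        spacing_m=0.2, fs_hz=16000.0)
```

The center-spike matrix passes each sensor signal straight to one output and so carries no geometric information, whereas the delay-and-sum matrix encodes assumed source directions in the positions of its off-center spikes. This is the kind of geometric prior that the abstract reports frequency-domain methods generally need.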

Author information

Authors and Affiliations

Department of Electrical Engineering, Southern Methodist University, Dallas, Texas 75275, USA
Malay Gupta & Scott C. Douglas

Editor information

Mike E. Davies, Christopher J. James, Samer A. Abdallah, Mark D. Plumbley

About this paper

Cite this paper

Gupta, M., Douglas, S.C. (2007). Beamforming Initialization and Data Prewhitening in Natural Gradient Convolutive Blind Source Separation of Speech Mixtures. In: Davies, M.E., James, C.J., Abdallah, S.A., Plumbley, M.D. (eds) Independent Component Analysis and Signal Separation. ICA 2007. Lecture Notes in Computer Science, vol 4666. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74494-8_58

DOI: https://doi.org/10.1007/978-3-540-74494-8_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74493-1
Online ISBN: 978-3-540-74494-8
eBook Packages: Computer Science, Computer Science (R0)
