Beamforming Initialization and Data Prewhitening in Natural Gradient Convolutive Blind Source Separation of Speech Mixtures

  • Conference paper
Independent Component Analysis and Signal Separation (ICA 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4666)

Abstract

Successful speech enhancement by convolutive blind source separation (BSS) techniques requires careful design of all aspects of the chosen separation method. The conventional strategy for system initialization in both time- and frequency-domain BSS involves a diagonal center-spike FIR filter matrix and no data preprocessing; however, this strategy may not be the best choice for a given separation algorithm. In this paper, we experimentally evaluate two approaches for potentially improving the performance of time-domain and frequency-domain natural gradient speech separation algorithms: prewhitening of the signal mixtures, and delay-and-sum beamforming initialization of the separation system. The goal is to determine which of the two classes of algorithms benefits most from each approach. Our results indicate that frequency-domain natural gradient BSS methods generally need geometric information about the system to obtain any reasonable separation quality. For time-domain natural gradient separation algorithms, either beamforming initialization or prewhitening improves separation performance, particularly for larger-scale problems involving three or more sources and sensors.
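
The three ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the uniform linear array geometry, the far-field model, the integer-sample rounding of delays, and the use of purely spatial (instantaneous) whitening are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def center_spike_init(n, taps):
    """Conventional start: diagonal FIR matrix with a unit spike at the center tap."""
    W = np.zeros((n, n, taps))
    idx = np.arange(n)
    W[idx, idx, taps // 2] = 1.0
    return W

def spatial_prewhiten(x):
    """Whiten mixtures x (n_sensors, n_samples) so their sample covariance is I.

    Assumption: instantaneous spatial whitening only; the paper's prewhitening
    stage may differ (for example, it may also whiten in time).
    """
    xc = x - x.mean(axis=1, keepdims=True)
    cov = xc @ xc.T / xc.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    V = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T  # symmetric cov^(-1/2)
    return V @ xc, V

def delay_and_sum_init(mic_pos, doas_deg, taps, fs, c=343.0):
    """Delay-and-sum FIR matrix: W[i, j, :] steers output i toward doas_deg[i].

    Assumptions: microphones on a line at positions mic_pos (meters), far-field
    sources, angles measured from the array axis, delays rounded to samples.
    """
    mic_pos = np.asarray(mic_pos, dtype=float)
    n_mic = mic_pos.size
    W = np.zeros((len(doas_deg), n_mic, taps))
    for i, theta in enumerate(np.deg2rad(doas_deg)):
        # A microphone closer to the source hears it earlier, so it gets the
        # larger compensating delay; taps // 2 keeps every filter causal.
        lag = taps // 2 + np.rint(mic_pos * np.cos(theta) * fs / c).astype(int)
        if lag.min() < 0 or lag.max() >= taps:
            raise ValueError("filter too short for the required delays")
        W[i, np.arange(n_mic), lag] = 1.0 / n_mic
    return W

# Toy usage with synthetic data (hypothetical values, instantaneous mixing).
rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 20000))          # two speech-like (super-Gaussian) sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])    # assumed mixing matrix
x = A @ s
z, V = spatial_prewhiten(x)
W0 = delay_and_sum_init([0.0, 0.04], [0.0, 90.0], taps=9, fs=8000)
```

In this sketch, either `z` would be fed to a separation algorithm that starts from `center_spike_init`, or the raw mixtures would be used with `W0` as the starting filter matrix. Which combination works best is exactly the question the paper evaluates experimentally.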

Editor information

Mike E. Davies, Christopher J. James, Samer A. Abdallah, Mark D. Plumbley

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gupta, M., Douglas, S.C. (2007). Beamforming Initialization and Data Prewhitening in Natural Gradient Convolutive Blind Source Separation of Speech Mixtures. In: Davies, M.E., James, C.J., Abdallah, S.A., Plumbley, M.D. (eds) Independent Component Analysis and Signal Separation. ICA 2007. Lecture Notes in Computer Science, vol 4666. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74494-8_58

  • DOI: https://doi.org/10.1007/978-3-540-74494-8_58

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74493-1

  • Online ISBN: 978-3-540-74494-8

  • eBook Packages: Computer Science, Computer Science (R0)
