Convolutive Blind Source Separation for Noisy Mixtures

  • Robert Aichner
  • Herbert Buchner
  • Walter Kellermann
Part of the Signals and Communication Technology book series (SCT)

Convolutive blind source separation (BSS) is a promising technique for separating acoustic mixtures acquired by multiple microphones in reverberant environments. In contrast to conventional beamforming methods, no a priori knowledge about the source positions or sensor arrangement is necessary, which makes the algorithms more versatile. In this contribution, we first review a general BSS framework called TRINICON, which allows a unified treatment of broadband and narrowband BSS algorithms. Efficient algorithms are presented, and their high performance is confirmed by experimental results in reverberant rooms. Subsequently, the BSS model is extended by incorporating background noise. Commonly encountered realistic noise types are examined and, based on the resulting model, pre-processing methods for noise-robust BSS adaptation are investigated. Additionally, an efficient post-processing technique following the BSS stage is presented, which aims at simultaneous suppression of background noise and residual cross-talk. Combining these pre- or post-processing approaches with the algorithms obtained from the TRINICON framework yields versatile BSS systems that can be applied in adverse environments, as demonstrated by experimental results.
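As a rough illustration of the signal model the abstract refers to, the sketch below builds a noisy convolutive mixture: each microphone signal is the sum of every source filtered by its own impulse response, plus additive noise. This is not the chapter's code. The function names `convolve` and `mix`, the 3-tap filter values, and the choice of independent white Gaussian sensor noise are all assumptions made for the example; the chapter itself examines several realistic noise types.

```python
import random


def convolve(signal, fir):
    """Linear convolution of signal with fir, truncated to len(signal) samples."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(fir):
            if n - k >= 0:
                acc += h * signal[n - k]
        out[n] = acc
    return out


def mix(sources, filters, noise_std=0.0, seed=0):
    """Return microphone signals x_p[n] = sum_q (h_qp * s_q)[n] + n_p[n].

    sources: list of Q equal-length sample lists.
    filters: filters[q][p] is the FIR impulse response from source q to mic p.
    noise_std: standard deviation of the assumed white Gaussian sensor noise.
    """
    rng = random.Random(seed)
    num_mics = len(filters[0])
    length = len(sources[0])
    mics = []
    for p in range(num_mics):
        x = [0.0] * length
        for q, s in enumerate(sources):
            contribution = convolve(s, filters[q][p])
            x = [a + b for a, b in zip(x, contribution)]
        # Additive background noise, independent per microphone (an assumption).
        x = [a + rng.gauss(0.0, noise_std) for a in x]
        mics.append(x)
    return mics


# Two sources, two microphones, toy 3-tap "room" responses (made-up values).
s1 = [1.0, 0.0, 0.0, 0.0, 0.0]
s2 = [0.0, 0.0, 1.0, 0.0, 0.0]
H = [
    [[1.0, 0.5, 0.25], [0.0, 0.6, 0.3]],  # source 1 -> mic 1, mic 2
    [[0.0, 0.4, 0.2], [1.0, 0.5, 0.25]],  # source 2 -> mic 1, mic 2
]
x = mix([s1, s2], H, noise_std=0.0)
```

A BSS algorithm sees only `x` and has to estimate demixing filters without access to `H`, `s1`, or `s2`. Setting `noise_std` above zero gives the noisy case that the pre- and post-processing stages are meant to handle.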

Keywords

Discrete Fourier Transform; Blind Source Separation; Acoustic Echo Cancellation; Convolutive Mixture; Magnitude Square Coherence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Robert Aichner (1)
  • Herbert Buchner (2)
  • Walter Kellermann (3)
  1. Microsoft Corporation, Redmond, USA
  2. Deutsche Telekom Laboratories, Technical University Berlin, Germany
  3. University of Erlangen-Nuremberg, Germany
