Blind Speech Separation, pp. 101-147

# TRINICON-based Blind System Identification with Application to Multiple-Source Localization and Separation

This contribution treats blind system identification approaches and how they can be used to localize multiple sources in environments where multipath propagation cannot be neglected, e.g., acoustic sources in reverberant environments. Based on TRINICON, a general framework for broadband adaptive MIMO signal processing, we first derive a versatile blind MIMO system identification method. For this purpose, the basics of TRINICON will be reviewed to the extent needed for this application, and some new algorithmic aspects will be emphasized. The generic approach then allows us to study various illustrative relations to other algorithms and applications. In particular, it is shown that the optimization criteria used for blind system identification allow a generalization of the well-known Adaptive Eigenvalue Decomposition (AED) algorithm for source localization: Instead of one source as with AED, several sources can be localized simultaneously. Performance evaluation in realistic scenarios will show that this method compares favourably with other state-of-the-art methods for source localization.
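The single-source AED idea that the chapter generalizes can be illustrated with a small batch sketch. This is not the TRINICON-based method of the chapter, and it is not an adaptive implementation. It only shows the underlying two-channel cross-relation: if both microphone signals are the same source filtered by impulse responses `h1` and `h2`, then `x1 * h2 - x2 * h1 = 0`, so the stacked vector `[h2, -h1]` lies in the null space of the spatio-temporal correlation matrix. The function name `aed_tdoa`, the filter length `L`, and the synthetic impulse responses are assumptions made for illustration; the sketch also assumes noise-free signals, a known filter length, and no common zeros between the two channels.

```python
import numpy as np


def aed_tdoa(x1, x2, L):
    """Batch sketch of the two-channel eigenvector approach (hypothetical helper).

    Returns the estimated time difference of arrival in samples
    (direct-path delay of channel 2 minus that of channel 1) and the
    impulse response estimates, which are only defined up to a common scale.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    N = min(len(x1), len(x2))

    # Row n holds the L most recent samples of a channel, newest first.
    idx = np.arange(L - 1, N)[:, None] - np.arange(L)[None, :]
    X = np.hstack([x1[idx], x2[idx]])  # shape (N - L + 1, 2L)

    # Sample estimate of the 2L x 2L correlation matrix.
    R = X.T @ X / X.shape[0]

    # eigh returns eigenvalues in ascending order, so column 0 belongs
    # to the smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(R)
    u = eigvecs[:, 0]

    # Cross-relation: the first half scales like h2, the second like -h1.
    h2_est = u[:L]
    h1_est = -u[L:]

    # The direct paths are taken to be the dominant taps.
    tdoa = int(np.argmax(np.abs(h2_est)) - np.argmax(np.abs(h1_est)))
    return tdoa, h1_est, h2_est


# Synthetic example (assumed data): white source, short random "room" responses
# with a dominant direct path at tap 2 (channel 1) and tap 7 (channel 2).
rng = np.random.default_rng(0)
L = 16
h1 = 0.1 * rng.standard_normal(L)
h1[2] = 1.0
h2 = 0.1 * rng.standard_normal(L)
h2[7] = 1.0
s = rng.standard_normal(4000)
x1 = np.convolve(s, h1)
x2 = np.convolve(s, h2)

tdoa, h1_est, h2_est = aed_tdoa(x1, x2, L)
print("estimated TDOA in samples:", tdoa)
```

Because the full impulse responses are estimated rather than a single cross-correlation peak, the direct path can be separated from reflections, which is what makes this family of methods attractive in reverberant rooms. With more than one active source the two-channel cross-relation no longer holds, which is the limitation the multiple-source generalization described above addresses.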

## Keywords

Independent Component Analysis, Blind Source Separation, Blind Signal, Microphone Array, Sylvester Matrix
