Abstract
In this work, we present a linearly constrained signal extraction algorithm based on a Minimum Mutual Information (MMI) criterion, which allows us to exploit the three fundamental properties of speech and audio signals: nonstationarity, nonwhiteness, and nongaussianity. Hence, the proposed method is well suited for processing nonstationary, nongaussian broadband signals such as speech. Furthermore, from the linearly constrained MMI approach, we derive an efficient realization in a Generalized Sidelobe Canceler (GSC) structure. To estimate the relative transfer functions between the microphones, which are needed for the set of linear constraints, we use an informed time-domain independent component analysis algorithm that exploits coarse direction-of-arrival information about the target source. As a decisive advantage, this simplifies the otherwise challenging control mechanism for simultaneous adaptation of the GSC’s blocking matrix and its interference and noise canceler coefficients. Finally, we establish relations between the proposed method and other well-known multichannel linear filter approaches for signal extraction based on second-order statistics, and demonstrate the effectiveness of the proposed signal extraction method in a multispeaker scenario.
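The relation between a linearly constrained filter and its GSC realization can be illustrated in the much simpler second-order-statistics, single-frequency-bin case. The following is a minimal NumPy sketch under stated assumptions, not the MMI-based broadband algorithm of this chapter: the covariance matrix `R`, the RTF-like constraint matrix `C`, the desired response `f`, and the function names `lcmv_weights` and `gsc_weights` are all hypothetical and chosen for illustration only.

```python
import numpy as np


def lcmv_weights(R, C, f):
    """Closed-form LCMV filter: w = R^{-1} C (C^H R^{-1} C)^{-1} f."""
    Rinv_C = np.linalg.solve(R, C)
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, f)


def gsc_weights(R, C, f):
    """The same filter in GSC form: fixed beamformer minus a blocked, unconstrained path."""
    # Fixed (quiescent) beamformer that satisfies the constraints C^H w_q = f.
    w_q = C @ np.linalg.solve(C.conj().T @ C, f)
    # Blocking matrix: orthonormal basis of the null space of C^H, taken from the SVD of C.
    U, _, _ = np.linalg.svd(C, full_matrices=True)
    B = U[:, C.shape[1]:]
    # Unconstrained interference-and-noise canceler (Wiener solution).
    w_a = np.linalg.solve(B.conj().T @ R @ B, B.conj().T @ R @ w_q)
    return w_q - B @ w_a


# Synthetic example (assumed values): M microphones, K linear constraints, T snapshots.
rng = np.random.default_rng(0)
M, K, T = 4, 1, 500
X = rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))
R = X @ X.conj().T / T + 1e-3 * np.eye(M)   # sample covariance with diagonal loading
C = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))
C = C / C[0]                                 # normalize to the reference microphone, as an RTF would be
f = np.ones(K, dtype=complex)                # distortionless response toward the target

w_lcmv = lcmv_weights(R, C, f)
w_gsc = gsc_weights(R, C, f)
print("LCMV and GSC weights agree:", np.allclose(w_lcmv, w_gsc))
```

For a positive definite `R`, both forms should give the same weights up to numerical precision. The GSC form is attractive because the constrained problem is replaced by an unconstrained one in the canceler coefficients `w_a`, which can then be adapted with standard adaptive filters.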
Notes
1. The approach of minimizing the output power in the presence of linear constraints was originally presented by Frost in [14] for use with antenna arrays, assuming free-field propagation.
2. Note that we chose this reverberation time in order to demonstrate the advantage of the HOS-based realization over the SOS-based realization of the MMI-based GSC.
Abbreviations
- ICA: Independent Component Analysis
- BSS: Blind Source Separation
- MWF: Multichannel Wiener Filter
- LCMV: Linearly Constrained Minimum Variance
- DOA: Direction of Arrival
- RTF: Relative Transfer Functions
- SOS: Second Order Statistics
- MVDR: Minimum Variance Distortionless Response
- FIR: Finite Impulse Response
- AIR: Acoustic Impulse Response
- GSC: Generalized Sidelobe Canceler
- MMI: Minimum Mutual Information
- STFT: Short-Time Fourier Transform
- VAD: Voice Activity Detection
- SPP: Speech Presence Probability
- TRINICON: TRIple-N Independent component analysis for CONvolutive mixtures
- SE: Signal Extraction
- NRE: Normalized RTF Estimation Error
- SIR: Signal-to-Interference Ratio
References
W. Kellermann, H. Buchner, W. Herbordt, R. Aichner, Multichannel acoustic signal processing for human/machine interfaces—Fundamental problems and recent advances, in Proceedings of the International Conference on Acoustics (ICA), April 2004, pp. I–243–250
S. Doclo, W. Kellermann, S. Makino, S.E. Nordholm, Multichannel signal enhancement algorithms for assisted listening devices: Exploiting spatial diversity using multiple microphones. IEEE Signal Process. Mag. 32(2), 18–30 (2015)
S. Gannot, E. Vincent, S. Markovich-Golan, A. Ozerov, A consolidated perspective on multimicrophone speech enhancement and source separation. IEEE/ACM Trans. Audio, Speech Lang. Process. (ASLP). 25(4), 692–730 (2017)
B.D. Van Veen, K.M. Buckley, Beamforming: a versatile approach to spatial filtering. IEEE ASSP Mag. 5(2), 4–24 (1988)
A. Hyvärinen, J. Karhunen, E. Oja, Independent Component Analysis (Wiley, 2001)
P. Smaragdis, Blind separation of convolved mixtures in the frequency domain. Neurocomputing 22(1–3), 21–34 (1998)
L. Parra, C. Spence, Convolutive blind separation of non-stationary sources. IEEE Trans. Audio Speech Lang. Process. (ASL) 8(3), 320–327 (2000)
H. Buchner, R. Aichner, W. Kellermann, Blind source separation for convolutive mixtures: a unified treatment, in Audio Signal Processing for Next-generation Multimedia Communication Systems, ed. by Y. Huang, J. Benesty (Kluwer Academic Publishers, 2004), pp. 255–293
S. Makino, T.-W. Lee, H. Sawada, Blind Speech Separation (Springer, 2007)
A. Ozerov, C. Fevotte, Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation. IEEE Trans. Audio Speech Lang. Process. (ASL) 18(3), 550–563 (2010)
B. Widrow, P.E. Mantey, L.J. Griffiths, B.B. Goode, Adaptive antenna systems. Proc. IEEE 55(12), 2143–2159 (1967)
A. Spriet, M. Moonen, J. Wouters, Spatially pre-processed speech distortion weighted multi-channel Wiener filtering for noise reduction. Signal Process. (SP) 84(12), 2367–2387 (2004)
S. Doclo, A. Spriet, J. Wouters, M. Moonen, Frequency-domain criterion for the speech distortion weighted multichannel Wiener filter for robust noise reduction. Speech Commun. (SC) 49(7–8), 636–656 (2007)
O.L. Frost, An algorithm for linearly constrained adaptive array processing. Proc. IEEE 60(8), 926–935 (1972)
L. Griffiths, C. Jim, An alternative approach to linearly constrained adaptive beamforming. IEEE Trans. Antennas Propag. (AP) 30(1), 27–34 (1982)
H. Buchner, R. Aichner, W. Kellermann, Blind source separation algorithms for convolutive mixtures exploiting nongaussianity, nonwhiteness, and nonstationarity, in Proceedings of the International Workshop Acoustic Echo Noise Control (IWAENC), September 2003, pp. 275–278
W. Herbordt, H. Buchner, W. Kellermann, An acoustic human-machine front-end for multimedia applications. EURASIP J. Adv. Signal Process. 2003(1), 1–11 (2003)
W. Herbordt, Sound Capture for Human/Machine Interfaces—Practical Aspects of Microphone Array Signal Processing (Springer, Heidelberg, Germany, 2005)
P. Oak, W. Kellermann, A calibration method for robust generalized sidelobe cancelling beamformers, in Proceedings of the International Workshop Acoustic Echo Noise Control (IWAENC), September 2005, pp. 97–100
M. Souden, J. Chen, J. Benesty, S. Affes, An integrated solution for online multichannel noise tracking and reduction. IEEE Trans. Audio Speech Lang. Process. (ASL) 19(7), 2159–2169 (2011)
R.C. Hendriks, T. Gerkmann, Noise correlation matrix estimation for multi-microphone speech enhancement. IEEE Trans. Audio Speech Lang. Process. (ASL) 20(1), 223–233 (2012)
E.A.P. Habets, S. Gannot, Tutorial: Linear and parametric microphone array processing, in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada, https://www.audiolabs-erlangen.de/fau/professor/habets/activities/ICASSP-2013/
S. Gannot, D. Burshtein, E. Weinstein, Signal enhancement using beamforming and nonstationarity with applications to speech. IEEE Trans. Signal Process. (SP) 49(8), 1614–1626 (2001)
S. Gannot, I. Cohen, Speech enhancement based on the general transfer function GSC and postfiltering. IEEE Speech Audio Process. (SAP) 12(6), 561–571 (2004)
S. Markovich-Golan, S. Gannot, I. Cohen, Multichannel eigenspace beamforming in a reverberant noisy environment with multiple interfering speech signals. IEEE Trans. Audio Speech Lang. Process. (ASL) 17(6), 1071–1086 (2009)
S.M. Golan, S. Gannot, I. Cohen, Subspace tracking of multiple sources and its application to speakers extraction, in Proceedings of the IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP), March 2010, pp. 201–204
L.C. Parra, C.V. Alvino, Geometric source separation: Merging convolutive source separation with geometric beamforming. IEEE Speech Audio Process. (SAP) 10(6), 352–362 (2002)
G. Reuven, S. Gannot, I. Cohen, Dual-source transfer-function generalized sidelobe canceller. IEEE Trans. Audio Speech Lang. Process. (ASL) 16(4), 711–727 (2008)
S. Gannot, D. Burshtein, E. Weinstein, Analysis of the power spectral deviation of the general transfer function GSC. IEEE Trans. Signal Process. (SP) 52(4), 1115–1120 (2004)
Y. Zheng, K. Reindl, W. Kellermann, BSS for improved interference estimation for blind speech signal extraction with two microphones, in Proceedings of the International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), Aruba, Dutch Antilles, December 2009, pp. 253–256
J. Benesty, J. Chen, Y. Huang, J. Dmochowski, On microphone-array beamforming from a MIMO acoustic signal processing perspective. IEEE Trans. Audio Speech Lang. Process. (ASL) 15(3), 1053–1065 (2007)
J. Chen, J. Benesty, Y. Huang, An acoustic MIMO framework for analyzing microphone-array beamforming, in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, April 2007, pp. I–25–I–28
H. Van Trees, Optimum Array Processing: Part IV of Detection, Estimation, and Modulation Theory (Wiley, 2004)
K. Reindl, Multichannel Acoustic Signal Extraction for Reverberant Environments (Verlag Dr. Hut, München, 2015)
W. Herbordt, W. Kellermann, Adaptive beamforming for audio signal acquisition, in Adaptive Signal Processing - Applications to Real-World Problems, ed. by J. Benesty, Y. Huang (Springer, Berlin, Germany, 2003), pp. 155–194
G. Strang, Linear Algebra and Its Applications, 4th edn. (Thomson, Brooks/Cole, Belmont, CA, 2006)
K.M. Buckley, L.J. Griffiths, An adaptive generalized sidelobe canceller with derivative constraints. IEEE Trans. Antennas Propag. (AP) 34(3), 311–319 (1986)
K. Buckley, Broad-band beamforming and the generalized sidelobe canceller. IEEE Trans. Acoust. Speech Signal Process. (ASSP) 34(5), 1322–1323 (1986)
B.R. Breed, J. Strauss, A short proof of the equivalence of LCMV and GSC beamforming. IEEE Signal Process. Lett. (SPL) 9(6), 168–169 (2002)
O. Hoshuyama, A. Sugiyama, A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 1996, pp. 925–928
O. Hoshuyama, A. Sugiyama, A. Hirano, A robust adaptive microphone array with improved spatial selectivity and its evaluation in a real environment, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 1997, pp. 367–370
O. Hoshuyama, A. Sugiyama, A. Hirano, A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters. IEEE Trans. Signal Process. (SP) 47(10), 2677–2684 (1999)
W. Herbordt, W. Kellermann, Analysis of blocking matrices for generalized sidelobe cancellers for non-stationary broadband signals, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2002, pp. 4187–4187
W. Herbordt, H. Buchner, S. Nakamura, W. Kellermann, Application of a double-talk resilient DFT-domain adaptive filter for bin-wise stepsize controls to adaptive beamforming, in Proceedings of the International Workshop on Nonlinear Signal and Image Processing (NSIP), May 2005
W. Herbordt, H. Buchner, S. Nakamura, W. Kellermann, Outlier-robust DFT-domain adaptive filtering for bin-wise stepsize controls, and its application to a generalized sidelobe canceller, in Proceedings of the International Workshop Acoustic Echo Noise Control (IWAENC), September 2005, pp. 113–116
W. Herbordt, H. Buchner, S. Nakamura, W. Kellermann, Multichannel bin-wise robust frequency-domain adaptive filtering and its application to adaptive beamforming. IEEE Trans. Audio Speech Lang. Process. (ASL) 15(4), 1340–1351 (2007)
P.J. Huber, E.M. Ronchetti, Robust Statistics. Wiley Series in Probability and Statistics (Wiley, 2009)
K. Kumatani, T. Gehrig, U. Mayer, E. Stoimenov, J. McDonough, M. Wölfel, Adaptive beamforming with a minimum mutual information criterion. IEEE Trans. Audio Speech Lang. Process. (ASL) 15(8), 2527–2541 (2007)
K. Kumatani, J. McDonough, B. Rauch, D. Klakow, P.N. Garner, W. Li, Beamforming with a maximum negentropy criterion. IEEE Trans. Audio Speech Lang. Process. (ASL) 17(5), 994–1008 (2009)
G. Reuven, S. Gannot, I. Cohen, Performance analysis of dual source transfer-function generalized sidelobe canceller. Speech Commun. (SC) 49(7–8), 602–622 (2007)
S. Markovich, S. Gannot, I. Cohen, A comparison between alternative beamforming strategies for interference cancelation in noisy and reverberant environment, in IEEE Convention of Electrical and Electronics Engineers in Israel (IEEEI), December 2008, pp. 203–207
S. Markovich-Golan, S. Gannot, I. Cohen, A sparse blocking matrix for multiple constraints GSC beamformer, in Proceedings of the IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP), March 2012, pp. 197–200
S. Markovich-Golan, S. Gannot, I. Cohen, Distributed multiple constraints generalized sidelobe canceler for fully connected wireless acoustic sensor networks. IEEE Trans. Audio Speech Lang. Process. (ASL) 21(2), 343–356 (2013)
R. Talmon, I. Cohen, S. Gannot, Relative transfer function identification using convolutive transfer function approximation. IEEE Trans. Audio Speech Lang. Process. (ASL) 17(4), 546–555 (2009)
R. Talmon, I. Cohen, S. Gannot, Convolutive transfer function generalized sidelobe canceler. IEEE Trans. Audio Speech Lang. Process. (ASL) 17(7), 1420–1434 (2009)
O. Shalvi, E. Weinstein, System identification using nonstationary signals. IEEE Trans. Signal Process. (SP) 44(8), 2055–2063 (1996)
I. Cohen, Relative transfer function identification using speech signals. IEEE Speech Audio Process. (SAP) 12(5), 451–459 (2004)
M. Schwab, P. Noll, T. Sikora, Noise robust relative transfer function estimation, in European Signal Processing Conference (EUSIPCO), September 2006, pp. 1–5
R. Talmon, I. Cohen, S. Gannot, Identification of the relative transfer function between microphones in reverberant environments, in IEEE Convention of Electrical and Electronics Engineers in Israel (IEEEI), December 2008, pp. 208–212
R. Talmon, I. Cohen, S. Gannot, Identification of the Relative Transfer Function between Sensors in the Short-Time Fourier Transform Domain (Springer, Berlin, Heidelberg, 2010), pp. 33–47
A. Krueger, E. Warsitz, R. Haeb-Umbach, Speech enhancement with a GSC-like structure employing eigenvector-based transfer function ratios estimation. IEEE Trans. Audio Speech Lang. Process. (ASL) 19(1), 206–219 (2011)
X. Li, L. Girin, R. Horaud, S. Gannot, Estimation of relative transfer function in the presence of stationary noise based on segmental power spectral density matrix subtraction, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2015, pp. 320–324
T. Gerkmann, C. Breithaupt, R. Martin, Improved a posteriori speech presence probability estimation based on a likelihood ratio with fixed priors. IEEE Trans. Audio Speech Lang. Process. (ASL) 16(5), 910–919 (2008)
T. Gerkmann, M. Krawczyk, R. Martin, Speech presence probability estimation based on temporal cepstrum smoothing, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2010, pp. 4254–4257
E. Weinstein, M. Feder, A.V. Oppenheim, Multi-channel signal separation by decorrelation. IEEE Speech Audio Process. (SAP) 1(4), 405–413 (1993)
K. Reindl, S. Markovich-Golan, H. Barfuss, S. Gannot, W. Kellermann, Geometrically constrained TRINICON-based relative transfer function estimation in underdetermined scenarios, in Proceedings of the IEEE Workshop Applications Signal Processing Audio Acoustics (WASPAA), October 2013
K. Reindl, W. Kellermann, Linearly-constrained multichannel interference suppression algorithms derived from a minimum mutual information criterion, in Proceedings of the IEEE China Summit & International Conference on Communication and Signal Processing (ChinaSIP 2013), July 2013
K. Reindl, S. Meier, H. Barfuss, W. Kellermann, Minimum mutual information-based linearly constrained broadband signal extraction. IEEE/ACM Trans. Audio Speech Lang. Process. (ASLP) 22(6), 1096–1108 (2014)
C. Shannon, A mathematical theory of communication. Bell Syst. Tech. J. 27, 623–656 (1948)
R. Aichner, Acoustic blind source separation in reverberant and noisy environments. Ph.D. dissertation, University Erlangen-Nürnberg, Germany, May 2007
H. Buchner, Broadband adaptive MIMO filtering: a unified treatment and applications to acoustic human-machine interfaces. Ph.D. dissertation, University Erlangen-Nürnberg, Germany, 2010
H. Buchner, A systematic approach to incorporate deterministic prior knowledge in broadband adaptive MIMO systems, in Proceedings of the Asilomar Conference on Signals, Systems and Computers (ASILOMAR), November 2010, pp. 461–468
S.I. Amari, Natural gradient works efficiently in learning. Neural Comput. 10(2), 251–276 (1998)
H. Buchner, R. Aichner, W. Kellermann, A generalization of blind source separation algorithms for convolutive mixtures based on second-order statistics. IEEE Speech Audio Process. (SAP) 13(1), 120–134 (2005)
S. Markovich-Golan, S. Gannot, W. Kellermann, Combined LCMV-TRINICON beamforming for separating multiple speech sources in noisy and reverberant environments. IEEE/ACM Trans. Audio Speech Lang. Process. (ASLP) 25(2), 320–332 (2017)
J. Chen, J. Benesty, Y. Huang, A minimum distortion noise reduction algorithm with multiple microphones. IEEE Trans. Audio Speech Lang. Process. (ASL) 16(3), 481–493 (2008)
E.A.P. Habets, J. Benesty, I. Cohen, S. Gannot, On a tradeoff between dereverberation and noise reduction using the MVDR beamformer, in Proceedings of the IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP), April 2009, pp. 3741–3744
E.A.P. Habets, J. Benesty, I. Cohen, S. Gannot, J. Dmochowski, New insights into the MVDR beamformer in room acoustics. IEEE Trans. Audio Speech Lang. Process. (ASL) 18(1), 158–170 (2010)
E.A.P. Habets, J. Benesty, S. Gannot, I. Cohen, The MVDR beamformer for speech enhancement, in Speech Processing in Modern Communication-Challenges and Perspectives, ed. by I. Cohen, J. Benesty, S. Gannot (Springer, Berlin, Germany, 2010), pp. 225–254
M. Knaak, S. Araki, S. Makino, Geometrically constraint ICA for convolutive mixtures of sound, in Proceedings of the IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP), vol. 2, April 2003, pp. II–725–728
M. Knaak, S. Araki, S. Makino, Geometrically constraint ICA for robust separation of sound mixtures, in Proceedings of the International Symposium on Independent Component Analysis Blind Separation (ICA), April 2003, pp. 951–956
M. Knaak, S. Araki, S. Makino, Geometrically constrained independent component analysis. IEEE Trans. Audio Speech Lang. Process. (ASL) 15(2), 715–726 (2007)
Y. Zheng, K. Reindl, W. Kellermann, Analysis of dual-channel ICA-based blocking matrix for improved noise estimation. EURASIP J. Adv. Signal Process. 2014(4:26), 1–24 (2014)
H. Brehm, W. Stammler, Description and generation of spherically invariant speech-model signals. Signal Process. (SP) 12, 119–141 (1987)
H. Buchner, W. Kellermann, A fundamental relation between blind and supervised adaptive filtering illustrated for blind source separation and acoustic echo cancellation, in Joint Workshop Hands-free Speech Communication, Microphone Arrays (HSCMA), May 2008, pp. 17–20
S. Haykin, Adaptive Filter Theory, 4th edn. (Prentice-Hall, 2002)
R. Aichner, H. Buchner, F. Yan, W. Kellermann, A real-time blind source separation scheme and its application to reverberant and noisy acoustic environments. Signal Process. (SP) 86, 1260–1277 (2006)
R. Aichner, H. Buchner, W. Kellermann, Exploiting narrowband efficiency for broadband convolutive blind source separation. EURASIP J. Adv. Signal Process. 2007, 1–9 (2006)
H. Barfuss, W. Kellermann, An adaptive microphone array topology for target signal extraction with humanoid robots, in Proceedings of the International Workshop Acoustic Signal Enhancement (IWAENC), September 2014, pp. 16–20
K. Kumatani, J. McDonough, B. Raj, Microphone array processing for distant speech recognition: from close-talking microphones to far-field sensors. IEEE Signal Process. Mag. 29(6), 127–140 (2012)
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Barfuss, H., Reindl, K., Kellermann, W. (2018). Informed Spatial Filtering Based on Constrained Independent Component Analysis. In: Makino, S. (eds) Audio Source Separation. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-73031-8_10
DOI: https://doi.org/10.1007/978-3-319-73031-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73030-1
Online ISBN: 978-3-319-73031-8
eBook Packages: Engineering (R0)