Acoustic Source Localization by Combination of Supervised Direction-of-Arrival Estimation with Disjoint Component Analysis

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10169)


Analysis and processing in reverberant, multi-source acoustic environments encompass a range of techniques: estimating from sensor signals a spatially resolved “image” of acoustic space; deriving a high-level representation of physical sources that consolidates several source components into a single sound object; and estimating filter parameters that permit enhancement of the target and attenuation of interfering signal components.

The contribution of the present manuscript is a combination of algorithms from the fields of supervised learning, unsupervised subspace decomposition, and multi-channel signal enhancement that accomplishes these goals.

Specifically, we propose a system that (1) uses a bank of trained support vector machine classifiers to estimate source activity probability for each spatial position and (2) employs disjoint component analysis (DCA) to obtain from this probabilistic spatial source activity map those components that pertain to individual sound objects. We conclude with a brief outline for (3) estimation of multi-channel filter parameters based on DCA components in order to perform target source enhancement.
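To make step (1) concrete, the following is a minimal sketch of how a bank of per-direction classifiers could yield a probabilistic spatial source activity map. It is an illustration under stated assumptions, not the authors' implementation: the trained support vector machines are replaced by hand-set linear decision functions, the sigmoid mapping from decision value to probability is an assumed choice, and the names `activity_map`, `weights`, `biases`, and `frames` are hypothetical. The feature extraction and the subsequent DCA step are not shown.

```python
import math


def sigmoid(z: float) -> float:
    """Map a classifier decision value to a pseudo-probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def activity_map(frames, weights, biases):
    """Return P[t][k]: estimated activity probability of direction k in frame t.

    frames  : list of feature vectors, one per time frame (hypothetical features)
    weights : one weight vector per candidate direction
              (assumption: stand-in for trained SVM classifiers)
    biases  : one bias term per candidate direction
    """
    probs = []
    for x in frames:
        row = []
        for w, b in zip(weights, biases):
            # Linear decision value of the classifier for this direction.
            decision = sum(wi * xi for wi, xi in zip(w, x)) + b
            row.append(sigmoid(decision))
        probs.append(row)
    return probs


# Toy example: 3 candidate directions, 2-dimensional features, 2 time frames.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]
frames = [[2.0, -2.0], [-2.0, 2.0]]

P = activity_map(frames, weights, biases)
for t, row in enumerate(P):
    print(t, [round(p, 3) for p in row])
```

Each row of `P` is one time frame and each column one candidate direction. In the system described above, a map of this kind is the input from which DCA extracts the components that belong to individual sound objects.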

We illustrate the proposed method with decomposition results obtained with a four-channel hearing aid geometry setup that comprises two localized sources plus isotropic background noise in an anechoic environment.





Supported by DFG grants SFB/TRR 31 “The Active Auditory System” and FOR 1732 “Individualized Hearing Acoustics”.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Medical Physics Unit and Cluster of Excellence Hearing4all, Computational Audition Group, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
