A Latently Constrained Mixture Model for Audio Source Separation and Localization

  • Antoine Deleforge
  • Radu Horaud
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7191)


We present a method for audio source separation and localization from binaural recordings. The method combines a new generative probabilistic model with time-frequency masking. We suggest that device-dependent relationships between point-source positions and interaural spectral cues may be learnt in order to constrain a mixture model. This makes it possible to capture subtle separation and localization features embedded in the auditory data. We illustrate our method on mixtures of two and three speech signals recorded in the presence of reverberation. Using standard evaluation metrics, we compare our method with a recent binaural source separation and localization algorithm.
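The two ingredients named in the abstract, interaural cues and time-frequency masking, can be illustrated with a minimal sketch. It does not reproduce the authors' mixture model. It assumes the common definitions of interaural level difference (ILD, in dB) and interaural phase difference (IPD, in radians), which the abstract does not spell out, and it takes the per-point source labels as given rather than inferring them. The function names `interaural_cues`, `binary_masks` and `apply_mask` are hypothetical.

```python
import cmath
import math


def interaural_cues(left, right, eps=1e-12):
    """Return (ILD in dB, IPD in radians) for one time-frequency point.

    left, right: complex STFT coefficients of the two microphone channels.
    These definitions are a common convention, assumed here for illustration.
    """
    ild = 20.0 * math.log10((abs(right) + eps) / (abs(left) + eps))
    ipd = cmath.phase(right * left.conjugate())
    return ild, ipd


def binary_masks(labels, n_sources):
    """Turn per-point source labels into one 0/1 mask per source.

    labels: list of frames, each a list of integer source indices.
    In a full system the labels would come from a probabilistic model;
    here they are simply an input.
    """
    return [
        [[1.0 if label == k else 0.0 for label in frame] for frame in labels]
        for k in range(n_sources)
    ]


def apply_mask(spectrogram, mask):
    """Multiply a complex spectrogram by a mask, point by point."""
    return [
        [m * x for m, x in zip(mask_row, spec_row)]
        for mask_row, spec_row in zip(mask, spectrogram)
    ]


# Toy example: a 2x2 spectrogram and hand-picked labels (assumed, not inferred).
mixture = [[1 + 1j, 2 + 0j], [3 + 0j, 0 + 4j]]
labels = [[0, 1], [1, 0]]
masks = binary_masks(labels, n_sources=2)
source_0 = apply_mask(mixture, masks[0])
source_1 = apply_mask(mixture, masks[1])
```

Because each time-frequency point is assigned to exactly one source, the masked spectrograms add up to the mixture. This is the usual behaviour of binary masking and says nothing about how well a given labelling separates real speech.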


Sound Source, Source Position, Room Impulse Response, Constrain Mixture Model, Sound Intensity Level
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Antoine Deleforge (1)
  • Radu Horaud (1)
  1. INRIA Grenoble Rhône-Alpes, France
