K-means Based Underdetermined Blind Speech Separation

Chapter in: Blind Speech Separation

This chapter presents a blind sparse source separation method that works with multiple, arbitrarily arranged microphones. Several sparse source separation methods that rely on source sparseness and an anechoic mixing model have already been proposed, and this chapter investigates the validity of both assumptions. Because most existing methods use a stereo (two-sensor) system, their separation ability is limited to a 2-dimensional half-plane. The method described here employs the k-means algorithm, an efficient clustering algorithm, and can easily be applied to three or more sensors arranged nonlinearly. Promising results were obtained for 2- and 3-dimensionally distributed speech signals with nonlinear/nonuniform sensor arrays in a real room, even in underdetermined situations.
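
The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the chapter's exact procedure: the feature normalization (phase referenced to the first sensor, unit norm), the farthest-point initialization, and all function names are assumptions made for illustration. The input `X` is assumed to hold one complex observation vector per time-frequency point, with one column per microphone, and each point is assumed to be dominated by a single source (the sparseness assumption).

```python
import numpy as np

def normalize_observations(X):
    """X: complex array (n_points, n_sensors).
    Assumed normalization: remove the phase of sensor 0, scale to unit norm."""
    Y = X * np.exp(-1j * np.angle(X[:, :1]))
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    return Y / np.maximum(norms, 1e-12)

def kmeans(features, k, n_iter=20):
    """Plain k-means on real-valued features (n_points, dim).
    Farthest-point initialization is an illustrative choice."""
    centroids = [features[0]]
    for _ in range(1, k):
        # distance of every point to its nearest already-chosen centroid
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centroids], axis=0)
        centroids.append(features[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    # final assignment against the updated centroids
    dist = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1), centroids

def cluster_masks(X, n_sources):
    """Return boolean masks (n_sources, n_points), one per cluster."""
    Y = normalize_observations(X)
    features = np.hstack([Y.real, Y.imag])  # stack real and imaginary parts
    labels, _ = kmeans(features, n_sources)
    return np.stack([labels == j for j in range(n_sources)])
```

Each mask selects the time-frequency points assigned to one cluster; multiplying a mask with one microphone's spectrogram and inverting the transform would give a separated signal estimate. Because the clustering acts on whole observation vectors, the same code accepts any number of sensors, including fewer sensors than sources.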

Copyright information

© 2007 Springer

About this chapter

Cite this chapter

Araki, S., Sawada, H., Makino, S. (2007). K-means Based Underdetermined Blind Speech Separation. In: Makino, S., Sawada, H., Lee, TW. (eds) Blind Speech Separation. Signals and Communication Technology. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6479-1_9

  • DOI: https://doi.org/10.1007/978-1-4020-6479-1_9

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-6478-4

  • Online ISBN: 978-1-4020-6479-1

  • eBook Packages: Engineering, Engineering (R0)
