Cepstral Smoothing for Convolutive Blind Speech Separation

  • Conference paper
Computational Intelligence and Information Technology (CIIT 2011)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 250)

Abstract

In this work, we propose an approach that combines two source separation techniques, convolutive blind source separation (BSS) exploiting the second-order non-stationarity of the signals and binary time-frequency masking, with a cepstral smoothing post-processing step. The latter smooths, in the cepstral domain, the binary masks estimated from the outputs of the BSS algorithm. The idea behind cepstral smoothing of the spectral masks is to improve interference suppression and to reduce the musical noise typically produced by time-frequency masking. Experimental results and evaluation measures demonstrate the performance of the proposed convolutive blind speech separation system.
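
The post-processing step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the magnitude-comparison rule for the binary mask, the mask floor, and the smoothing constants are all assumptions chosen for illustration, and the pitch-dependent handling used in some cepstral smoothing schemes is left out. The sketch takes the log of a floored mask, transforms it to the cepstral domain along the frequency axis, applies first-order recursive smoothing over time with a quefrency-dependent constant, and transforms back.

```python
import numpy as np

def binary_mask(Y1, Y2):
    """Hypothetical mask rule: 1 where BSS output 1 dominates output 2 in magnitude."""
    return (np.abs(Y1) > np.abs(Y2)).astype(float)

def cepstral_smooth_mask(mask, beta_low=0.0, beta_high=0.8, q_low=8, floor=0.1):
    """Smooth a time-frequency mask in the cepstral domain.

    mask      : array (n_frames, n_bins), one-sided spectrum, n_bins = n_fft/2 + 1
    beta_low  : smoothing constant for low quefrencies (assumed value)
    beta_high : smoothing constant for high quefrencies (assumed value)
    q_low     : quefrency index separating the two regions (assumed value)
    floor     : lower bound applied before the log to avoid log(0) (assumed value)
    """
    n_frames, n_bins = mask.shape
    n_fft = 2 * (n_bins - 1)

    # Log of the floored mask, then inverse real FFT along frequency: the mask cepstrum.
    log_mask = np.log(np.maximum(mask, floor))
    cep = np.fft.irfft(log_mask, n=n_fft, axis=1)

    # Quefrency-dependent smoothing constants, symmetric around n_fft / 2
    # so that the smoothed cepstrum stays an even sequence.
    q = np.arange(n_fft)
    q_sym = np.minimum(q, n_fft - q)
    beta = np.where(q_sym < q_low, beta_low, beta_high)

    # First-order recursive smoothing over frames, independently per quefrency bin.
    smoothed = np.empty_like(cep)
    smoothed[0] = cep[0]
    for l in range(1, n_frames):
        smoothed[l] = beta * smoothed[l - 1] + (1.0 - beta) * cep[l]

    # Back to the spectral domain: forward real FFT, then exponential.
    return np.exp(np.fft.rfft(smoothed, axis=1).real)
```

In such a sketch, little or no smoothing at low quefrencies preserves the coarse spectral envelope of the mask, while stronger smoothing at high quefrencies attenuates the isolated frame-to-frame fluctuations associated with musical noise. The smoothed mask would then be applied to the BSS output spectrogram before resynthesis.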

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Missaoui, I., Lachiri, Z. (2011). Cepstral Smoothing for Convolutive Blind Speech Separation. In: Das, V.V., Thankachan, N. (eds) Computational Intelligence and Information Technology. CIIT 2011. Communications in Computer and Information Science, vol 250. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25734-6_43

  • DOI: https://doi.org/10.1007/978-3-642-25734-6_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25733-9

  • Online ISBN: 978-3-642-25734-6

  • eBook Packages: Computer Science, Computer Science (R0)
