Underdetermined Blind Source Separation Using Linear Separation System
Automatic speech recognition and speech emotion recognition often require a good-quality input speech signal. Noise degrades speech quality and lowers the hit rate of recognizers. Blind source separation can be used as a preprocessing step to enhance the speech signal. This paper presents a multichannel linear blind source separation method that can be applied even in the underdetermined case, i.e., when the number of source signals is higher than the number of sensors. Experiments show that our system outperforms conventional time-frequency binary masking in both the determined and underdetermined cases and significantly increases the hit rate of speech recognizers.
Keywords: array signal processing, beamforming, blind source separation, speech processing, time-frequency binary masking
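To make the underdetermined setting and the conventional baseline concrete, the following is a minimal toy sketch of time-frequency binary masking with three sources and two sensors. It is not the method proposed in the paper. It assumes a real-valued instantaneous mixture, a known mixing matrix, and exactly one active source per time-frequency bin (an idealized sparsity assumption); all names and values (`A`, `angles`, `n_bins`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_src, n_mic, n_bins = 3, 2, 1000  # more sources than sensors: underdetermined

# Hypothetical mixing matrix: each column is a unit-norm direction for one source.
angles = np.array([0.3, 1.2, 2.2])
A = np.stack([np.cos(angles), np.sin(angles)])  # shape (n_mic, n_src)

# Toy "time-frequency" sources: exactly one source is active in each bin.
active = rng.integers(0, n_src, size=n_bins)
amps = rng.uniform(0.5, 2.0, size=n_bins) * rng.choice([-1.0, 1.0], size=n_bins)
S = np.zeros((n_src, n_bins))
S[active, np.arange(n_bins)] = amps

# Sensor observations in the (toy) time-frequency domain.
X = A @ S  # shape (n_mic, n_bins)

# Binary masking: assign each bin to the source whose direction matches best.
proj = A.T @ X                        # projection onto each source direction
labels = np.abs(proj).argmax(axis=0)  # winning source index per bin
masks = labels[None, :] == np.arange(n_src)[:, None]  # shape (n_src, n_bins)

# Masked estimate: keep each bin only for the source that owns it.
S_hat = masks * proj
```

Under these idealized assumptions the mask recovers every bin exactly, even though there are more sources than sensors. Real speech mixtures only approximately satisfy the one-source-per-bin assumption, so in practice binary masks discard overlapping energy and can introduce audible artifacts.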