A Speech Stream Detection in Adverse Acoustic Environments Based on Cross Correlation Technique
Speech signal detection is important in many areas of speech signal processing. In real environments, the speech signal is usually corrupted by background noise, which greatly degrades the performance of speech detection systems. Correlation analysis is a waveform analysis method commonly used in the time domain; the correlation function measures the similarity of two signals. This paper presents a new approach that detects speech in adverse acoustic environments from the waveform track of the cross-correlation coefficients. The approach first computes cross-correlation coefficients to remove irrelevant signal components and thereby reduce interference from noise, and then decides whether speech is present according to the waveform track. The algorithm is compared with approaches based on short-term energy and on spectral entropy under various noise conditions, and its performance is quantified by the probability of correct classification. The experiments show that the waveform of the cross-correlation coefficients strongly resists interference and is especially robust to colored noise such as babble.
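The abstract does not say how the cross-correlation coefficients are computed or how the waveform track is interpreted, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes that adjacent frames are compared, that the coefficient is the peak magnitude of the normalized cross-correlation over all lags, and that a fixed threshold marks speech-like frames. It also assumes that the probability of correct classification can be estimated as the fraction of frames labeled correctly. All function names, the frame length, and the threshold are hypothetical.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into frames of frame_len samples, hop samples apart."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])


def peak_cross_correlation(a, b):
    """Peak magnitude of the normalized cross-correlation of two frames."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    if denom == 0.0:
        return 0.0  # a constant (e.g. silent) frame has no defined correlation
    return float(np.max(np.abs(np.correlate(a, b, mode="full"))) / denom)


def correlation_track(x, frame_len=256, hop=256):
    """Coefficient between each pair of adjacent frames (assumed pairing)."""
    frames = frame_signal(x, frame_len, hop)
    return np.array([peak_cross_correlation(frames[i], frames[i + 1])
                     for i in range(len(frames) - 1)])


def detect(track, threshold=0.5):
    """Mark a frame pair as speech-like when its coefficient exceeds the threshold."""
    return np.asarray(track) > threshold


def probability_correct(decisions, labels):
    """Fraction of decisions that agree with the reference labels."""
    decisions = np.asarray(decisions, dtype=bool)
    labels = np.asarray(labels, dtype=bool)
    return float(np.mean(decisions == labels))


# Toy comparison: a periodic tone (a crude stand-in for voiced speech)
# against white noise. Both signals are synthetic, not data from the paper.
fs = 8000
n = np.arange(2048)
tone = np.sin(2 * np.pi * 250 * n / fs)
noise = np.random.default_rng(0).standard_normal(2048)

tone_track = correlation_track(tone)
noise_track = correlation_track(noise)
```

In this toy setting the periodic tone yields coefficients near 1 between adjacent frames, while white noise yields much smaller values. That gap is the property a correlation-based detector relies on. Real speech and colored noise such as babble would need a more careful decision rule than a single fixed threshold.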
Keywords: Speech Recognition; Speech Signal; Automatic Speech Recognition; Factory Noise; Colored Noise