A Speech Stream Detection in Adverse Acoustic Environments Based on Cross Correlation Technique

  • Ru-bo Zhang
  • Tian Wu
  • Xue-yao Li
  • Dong Xu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4222)

Abstract

Speech signal detection is important in many areas of speech signal processing. In real environments, the speech signal is usually corrupted by background noise, which greatly degrades the performance of a speech detection system. Correlation analysis is a waveform analysis method commonly used in the time domain, and the similarity of two signals can be measured with the correlation function. This paper presents a new approach that detects speech in adverse acoustic environments based on the waveform track of the cross-correlation coefficients. The approach first computes cross-correlation coefficients to remove irrelevant signal components and so reduce the interference from noise, and then decides whether the signal contains speech according to the waveform track. The performance of the algorithm is compared with approaches based on short-term energy and on spectral entropy under various noise conditions, and is quantified by the probability of correct classification. The experiments show that the waveform of the cross-correlation coefficients is strongly resistant to interference and is especially robust to colored noise such as babble.
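
The abstract does not give the formula or the decision rule, so the following is only a minimal sketch of the textbook normalized cross-correlation coefficient between two equal-length signal frames, which is one common way to measure waveform similarity in the time domain. The function name `cross_correlation_coefficients` and the `max_lag` parameter are assumptions made for illustration; the sketch does not reproduce the paper's waveform-track criterion.

```python
import math


def cross_correlation_coefficients(x, y, max_lag):
    """Normalized cross-correlation of x and y for lags -max_lag..max_lag.

    Hypothetical helper for illustration only; not the authors' algorithm.
    Both frames are mean-removed, and samples shifted outside the frame
    are treated as zero.
    """
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    # Product of the two frame energies; zero for a constant frame.
    norm = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    coeffs = []
    for lag in range(-max_lag, max_lag + 1):
        if norm == 0.0:
            # No variation in at least one frame: define the coefficient as 0.
            coeffs.append(0.0)
            continue
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += dx[i] * dy[j]
        coeffs.append(total / norm)
    return coeffs
```

Under these assumptions, a frame correlated with itself gives a coefficient of 1 at lag 0, while two unrelated noise frames tend to give coefficients near 0 at every lag. That contrast is the kind of similarity cue the abstract refers to when it speaks of removing irrelevant signal.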

Keywords

Speech Recognition, Speech Signal, Automatic Speech Recognition, Factory Noise, Colored Noise
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ru-bo Zhang (1)
  • Tian Wu (1)
  • Xue-yao Li (1)
  • Dong Xu (1)
  1. College of Computer Science and Technology, Harbin Engineering University, Harbin, China
