Zero-Crossing-Based Feature Extraction for Voice Command Systems Using Neck-Microphones

  • Sang Kyoon Park
  • Rhee Man Kil
  • Young-Giu Jung
  • Mun-Sung Han
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4491)

Abstract

This paper presents a zero-crossing-based feature extraction method for speech recognition using neck-microphones. One approach to noise-robust speech recognition is to use neck-microphones, which are not affected by environmental noise. However, neck-microphones distort the original voice signal significantly, since they capture only the vibrations of the vocal tract. In this context, we consider a new method of enhancing the speech features of neck-microphone signals using zero-crossings. Furthermore, to improve the zero-crossing features, we use the statistics of two adjacent zero-crossing intervals, that is, statistics over pairs of samples, referred to as second-order statistics. Simulations of speech recognition with a neck-microphone voice command system show that the suggested method performs better than approaches using conventional speech features.
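
The idea of statistics over two adjacent zero-crossing intervals can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of upward crossings only, and the toy sine input are all assumptions made for illustration. The sketch measures the distance in samples between successive upward zero-crossings, then counts how often each pair of adjacent intervals occurs, which is one simple form of second-order statistic.

```python
import math


def zero_crossing_intervals(signal):
    """Return sample distances between successive upward zero-crossings.

    Hypothetical helper: an upward crossing is taken to be a step from a
    negative sample to a non-negative one.
    """
    crossings = [
        i for i in range(1, len(signal))
        if signal[i - 1] < 0 <= signal[i]
    ]
    return [b - a for a, b in zip(crossings, crossings[1:])]


def adjacent_interval_counts(intervals):
    """Count occurrences of each (current, next) pair of adjacent intervals."""
    counts = {}
    for pair in zip(intervals, intervals[1:]):
        counts[pair] = counts.get(pair, 0) + 1
    return counts


# Toy input (assumed, not from the paper): a 100 Hz sine sampled at 8 kHz,
# so one period spans 80 samples.
sample_rate = 8000
tone = [
    math.sin(2 * math.pi * 100 * n / sample_rate + 0.5)
    for n in range(800)
]

intervals = zero_crossing_intervals(tone)
pair_counts = adjacent_interval_counts(intervals)
print(intervals)
print(pair_counts)
```

For this pure tone every interval is 80 samples, so all the counts fall on the single pair (80, 80), and the reciprocal of the interval gives back the tone frequency (8000 / 80 = 100 Hz). A real speech signal would spread the counts over many pairs, and it is that spread which a second-order feature would summarize.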

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Sang Kyoon Park (1)
  • Rhee Man Kil (1)
  • Young-Giu Jung (2)
  • Mun-Sung Han (2)
  1. Division of Applied Mathematics, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, Korea
  2. Smart Interface Research Team, Electronics and Telecommunications Research Institute, 161 Gajeong-dong, Yuseong-gu, Daejeon 305-700, Korea
