Signal Preprocessing for Speech Recognition

Kolokolov, A. S.

doi:10.1023/A:1014714820229

Published: March 2002

Automation and Remote Control, Volume 63, pages 494–501 (2002)

Abstract

The paper considers the frequency-domain transformations of speech that precede extraction of the informative attributes of phonemes. It proposes a way of processing the speech spectrum that keeps recognition stable in the presence of frequency distortions and additive noise. The processing is based on linear bandpass filtering of the logarithmic amplitude spectrum, followed by a nonlinear transformation that models the effect of lateral inhibition in the auditory analyzer.
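The two stages named in the abstract can be illustrated with a rough sketch. The abstract does not give filter shapes, parameter values, or the form of the nonlinearity, so everything below is an assumption made for illustration: the bandpass filter is taken to be a difference of two moving averages along the frequency axis, and lateral inhibition is taken to be subtraction of a fraction of the strongest neighbouring channel followed by half-wave rectification. The function names and the parameters `narrow`, `wide`, `reach`, and `strength` are hypothetical and do not come from the paper.

```python
import math


def moving_average(values, half_width):
    """Symmetric moving average along the frequency axis; edge bins are clamped."""
    n = len(values)
    out = []
    for k in range(n):
        window = [values[min(max(j, 0), n - 1)]
                  for j in range(k - half_width, k + half_width + 1)]
        out.append(sum(window) / len(window))
    return out


def bandpass_log_spectrum(amplitudes, narrow=1, wide=8, floor=1e-10):
    """Assumed bandpass: narrow smoothing minus wide smoothing of the log spectrum."""
    log_spec = [math.log(max(a, floor)) for a in amplitudes]
    fine = moving_average(log_spec, narrow)   # damps bin-to-bin ripple
    coarse = moving_average(log_spec, wide)   # tracks overall level and slow trend
    return [f - c for f, c in zip(fine, coarse)]


def lateral_inhibition(values, reach=3, strength=0.5):
    """Assumed nonlinearity: subtract a fraction of the strongest positive
    neighbour within `reach` bins, then half-wave rectify."""
    n = len(values)
    out = []
    for k in range(n):
        neighbours = [values[j]
                      for j in range(max(0, k - reach), min(n, k + reach + 1))
                      if j != k]
        inhibition = strength * max(0.0, max(neighbours))
        out.append(max(0.0, values[k] - inhibition))
    return out


def synthetic_spectrum(n_bins=64, peaks=(10, 30), width=3.0):
    """Toy amplitude spectrum: two formant-like bumps on a constant floor."""
    return [0.05 + sum(math.exp(-((k - p) / width) ** 2) for p in peaks)
            for k in range(n_bins)]


spectrum = synthetic_spectrum()
features = lateral_inhibition(bandpass_log_spectrum(spectrum))

# A constant gain multiplies the amplitude spectrum, which only shifts the
# log spectrum by a constant; the bandpass stage removes that shift.
louder = lateral_inhibition(bandpass_log_spectrum([4.0 * a for a in spectrum]))
largest_difference = max(abs(a - b) for a, b in zip(features, louder))
```

In this toy setting a flat gain change leaves the features unchanged up to rounding error, which is the simplest case of a frequency distortion. Working on the logarithmic spectrum is what turns a multiplicative distortion into an additive term that a linear filter along the frequency axis can suppress.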

Author information

Authors and Affiliations

Trapeznikov Institute of Control Sciences, Russian Academy of Sciences, Moscow, Russia
A. S. Kolokolov

About this article

Cite this article

Kolokolov, A.S. Signal Preprocessing for Speech Recognition. Automation and Remote Control 63, 494–501 (2002). https://doi.org/10.1023/A:1014714820229
