Current Problems in Automatic Speech Recognition

Ainsworth, W. A.; Green, P. D.

doi:10.1007/978-1-4613-4154-3_14

Abstract

Automatic speech recognition may be defined as any process which decodes the acoustic signal produced by the human voice into a sequence of linguistic units which contain the message that the speaker wishes to convey. At one extreme this includes the “phonetic typewriter,” a hypothetical device which types any words spoken into it, and at the other, “speech understanding systems” which extract the intended meaning from the sounds and carry out some appropriate action such as replying to a question or controlling a robot. During the last two decades the emphasis in research in automatic speech recognition has gradually shifted from the former type of device to the latter.

Author information

Authors and Affiliations

Department of Communication, University of Keele, Keele, Staffordshire, England
W. A. Ainsworth
Department of Computing, North Staffordshire Polytechnic, Stafford, England
P. D. Green

Editor information

Editors and Affiliations

University of Southampton, England
Bruce G. Batchelor

About this chapter

Cite this chapter

Ainsworth, W.A., Green, P.D. (1978). Current Problems in Automatic Speech Recognition. In: Batchelor, B.G. (ed.) Pattern Recognition. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-4154-3_14

DOI: https://doi.org/10.1007/978-1-4613-4154-3_14
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-4156-7
Online ISBN: 978-1-4613-4154-3
eBook Packages: Springer Book Archive