Abstract
Automatic speech recognition may be defined as any process which decodes the acoustic signal produced by the human voice into a sequence of linguistic units which contain the message that the speaker wishes to convey. At one extreme this includes the “phonetic typewriter,” a hypothetical device which types any words spoken into it, and at the other, “speech understanding systems” which extract the intended meaning from the sounds and carry out some appropriate action such as replying to a question or controlling a robot. During the last two decades the emphasis in research in automatic speech recognition has gradually shifted from the former type of device to the latter.
Copyright information
© 1978 Plenum Press, New York
About this chapter
Cite this chapter
Ainsworth, W.A., Green, P.D. (1978). Current Problems in Automatic Speech Recognition. In: Batchelor, B.G. (ed.) Pattern Recognition. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-4154-3_14
DOI: https://doi.org/10.1007/978-1-4613-4154-3_14
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-4156-7
Online ISBN: 978-1-4613-4154-3
eBook Packages: Springer Book Archive