The Perception of Synthetic Speech in Noise
Although much information about synthetic speech has been acquired over the past decades, we have been unable to find in the literature a systematic examination of the perception of synthetic speech in noise. Simpson has reported that synthetic speech altitude callouts, heard for the first time by airline pilots in widebody jet cockpit noise at a S/N ratio of -10 dB, were 99.1% intelligible, and that synthetic speech voice warnings presented to helicopter pilots in simulated helicopter noise at a S/N ratio of -22 dB were 99.2% intelligible. Nusbaum has reported that perceptual confusions for synthetic CV and VC syllables were quite different from the confusions observed for natural speech degraded by noise. Pisoni (personal communication) indicates that one of two synthetic speech systems with very high levels of segmental intelligibility in quiet showed greater decrements in the intelligibility of CV syllables in noise than did the other system. Clark reported little difference between synthetic and natural speech in the intelligibility of vowels in noise, whereas natural CV syllables were clearly superior to synthetic CV syllables under all noise conditions.
Keywords: Natural Speech, Synthetic Speech, Linear Predictive Code, Delta Modulation, Digital Speech
- 1. C. Simpson, “Synthesized Voice Control Callouts for Air Transport Operations,” NASA CR-3300, NASA Ames Research Center (1980).
- 2. C. Simpson and C. Marchionda, “Synthesized Speech Rate and Pitch Effects on Intelligibility of Warning Messages for Pilots,” Proc. of the Second Symp. on Aviation Psychology, Department of Aviation, Ohio State University, Columbus, Ohio, USA (1983).
- 3. H. C. Nusbaum, M. J. Dedina, and D. B. Pisoni, “Perceptual Confusions of Consonants in Natural and Synthetic CV Syllables,” Speech Research Laboratory Technical Note 84-02, Bloomington, Indiana, USA (1984).
- 4. J. E. Clark, Intelligibility comparisons for two synthetic and one natural speech source, J. Phonetics, 11:37 (1983).
- 5. R. L. McKinley, T. R. Anderson, and T. J. Moore, “Evaluation of Speech Synthesis for Use in Military Noise Environments,” Proceedings of National Bureau of Standards Workshop on Standardization of I/O Technology, Gaithersburg, Maryland, USA (1982).
- 6. J. Freedman and W. A. Rumbaugh, “Accuracy and Speed of Response to Different Voice Types in a Cockpit Voice Warning System,” Air Force Institute of Technology Report LSSR 89-83, WPAFB, Ohio, USA.
- 7. R. L. McKinley and T. J. Moore, “The Effect of Audio Bandwidth on Selected Digital Speech Coding Algorithms,” MILCOM ’85 Conference Record, IEEE, 345 East 47th Street, New York, New York, USA (1985).
- 11. L. M. Manous, M. J. Dedina, H. C. Nusbaum, and D. B. Pisoni, “Speeded Sentence Verification of Natural and Synthetic Speech,” Research on Speech Perception, Progress Report No. 11, Indiana University, Bloomington, Indiana, USA (1985).
- 12. P. A. Luce, T. C. Feustel, and D. B. Pisoni, Capacity demands in short-term memory for synthetic and natural word lists, Human Factors, 25:17 (1983).