Abstract
When human beings converse, they alternate between talking and listening. Participating in such turn-taking behaviors is more difficult for machines that use speech recognition to listen and speech output to talk. This paper describes an algorithm for managing such turn-taking through the use of a sliding capture window. The device is specific to discrete speech recognition technologies that do not have access to echo cancellation. As such, it addresses those inexpensive applications that suffer the most from turn-taking errors, providing a “speech button” that stabilizes the interface. Correcting for short-lived turn-taking errors can be thought of as “debouncing” the button. An informal study based on ten subjects using a voice dialing application illuminates the design.
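The “debouncing” analogy can be made concrete with a generic debounce filter over a boolean signal. The sketch below is only an illustration of that general idea, not the paper's sliding capture window algorithm, whose details the abstract does not give. The function name `debounce`, the `samples` input (for example, a per-frame "speech detected" flag), and the `hold` threshold are all hypothetical names chosen for this example.

```python
from typing import Iterable, List, Optional


def debounce(samples: Iterable[bool], hold: int) -> List[bool]:
    """Return a stabilized copy of `samples` (hypothetical helper).

    The output changes state only after the input has held the new
    value for `hold` consecutive samples; shorter flips are ignored.
    """
    if hold < 1:
        raise ValueError("hold must be at least 1")
    out: List[bool] = []
    state: Optional[bool] = None  # current stable state, taken from the first sample
    run = 0  # consecutive samples that disagree with `state`
    for s in samples:
        if state is None:
            state = s
        elif s == state:
            run = 0  # disagreement was short-lived; discard it
        else:
            run += 1
            if run >= hold:
                state = s  # the new value persisted long enough to accept
                run = 0
        out.append(state)
    return out


if __name__ == "__main__":
    raw = [False, True, False, False, True, True, True, False]
    print(debounce(raw, hold=2))
```

With `hold=2`, the isolated `True` at the second sample and the trailing single `False` are both suppressed, which is the sense in which a short-lived turn-taking error might be "debounced".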
Cite this article
Balentine, B.E., Ayer, C.M., Miller, C.L. et al. Debouncing the speech button: A sliding capture window device for synchronizing turn-taking. Int J Speech Technol 2, 7–19 (1997). https://doi.org/10.1007/BF02539819