Abstract
In this paper we present a Spoken Dialog System (SDS) with a Microphone Array (MA). Our goal is to create a hands-free home automation system with a speech interface to control home devices. The MA interface provides ubiquitous speech acquisition for the SDS. The implemented system allows any user, in any position in a room, to establish a dialog with a virtual butler that is able to control a wide range of home appliances (room lights, air conditioner, window shades and hi-fi features). This virtual butler has a 3D animated face that, while a dialog is in progress, turns toward the user's position and responds to his or her commands with synthesized speech. The presented results show that the MA performs well as a distant-talk interface and is a step towards more realistic human-machine interaction.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Coelho, G.E., Serralheiro, A.J., Neto, J.P. (2008). A Spoken Dialog System Speech Interface Based on a Microphone Array. In: Teixeira, A., de Lima, V.L.S., de Oliveira, L.C., Quaresma, P. (eds) Computational Processing of the Portuguese Language. PROPOR 2008. Lecture Notes in Computer Science, vol 5190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85980-2_3
DOI: https://doi.org/10.1007/978-3-540-85980-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85979-6
Online ISBN: 978-3-540-85980-2
eBook Packages: Computer Science (R0)