Summary
Galatea is a software toolkit for developing human-like spoken dialog agents. To make it easy to integrate modules with different characteristics, including a speech recognizer, a speech synthesizer, a facial animation synthesizer, and a dialog controller, each module is modeled as a virtual machine with a simple common interface, and the modules are connected to one another through a broker (communication manager). Galatea employs model-based speech and facial animation synthesizers whose model parameters can easily be adapted to an existing person, given training data for that person. The toolkit runs on both UNIX/Linux and Windows operating systems and will be publicly available in the middle of 2003 [6, 7].
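The broker-based integration described in the summary can be illustrated with a minimal sketch. It is not Galatea's actual protocol or API: the class names (`Module`, `Broker`, `Recorder`, `DialogManager`), the module names, and the text commands (`recognized`, `speak`, `express`) are all hypothetical, chosen only to show modules that share one simple interface and exchange messages solely through a central communication manager.

```python
from typing import Callable, Dict, List, Tuple


class Module:
    """Hypothetical common interface: every module handles plain-text commands."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Replaced by the broker at registration time; a no-op until then.
        self.send: Callable[[str, str], None] = lambda target, command: None

    def handle(self, sender: str, command: str) -> None:
        raise NotImplementedError


class Broker:
    """Hypothetical communication manager: modules talk only through it."""

    def __init__(self) -> None:
        self.modules: Dict[str, Module] = {}
        self.log: List[Tuple[str, str, str]] = []

    def register(self, module: Module) -> None:
        self.modules[module.name] = module
        # Give the module a send function that always goes via the broker.
        module.send = lambda target, command, _src=module.name: self.route(
            _src, target, command
        )

    def route(self, sender: str, target: str, command: str) -> None:
        self.log.append((sender, target, command))
        self.modules[target].handle(sender, command)


class Recorder(Module):
    """Stand-in for a recognizer or synthesizer: it only records commands."""

    def __init__(self, name: str) -> None:
        super().__init__(name)
        self.received: List[str] = []

    def handle(self, sender: str, command: str) -> None:
        self.received.append(command)


class DialogManager(Module):
    """Stand-in dialog controller: echoes recognized text and asks for a smile."""

    def handle(self, sender: str, command: str) -> None:
        kind, _, text = command.partition(" ")
        if kind == "recognized":
            self.send("speech_synthesizer", f"speak You said: {text}")
            self.send("face_synthesizer", "express smile")


def run_demo(utterance: str) -> List[Tuple[str, str, str]]:
    """Wire four modules to one broker and push one recognition result through."""
    broker = Broker()
    recognizer = Recorder("speech_recognizer")
    for module in (
        recognizer,
        Recorder("speech_synthesizer"),
        Recorder("face_synthesizer"),
        DialogManager("dialog_manager"),
    ):
        broker.register(module)
    recognizer.send("dialog_manager", f"recognized {utterance}")
    return broker.log


if __name__ == "__main__":
    for entry in run_demo("hello"):
        print(entry)
```

Because each module only sees the common `handle`/`send` pair, a stand-in such as `Recorder` could in principle be swapped for a real recognizer or synthesizer without touching the other modules, which is the integration benefit the summary attributes to the broker design.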
References
Adachi, H., Katsurada, K., Yamada, H., Nitta, T.: Development of a prototyping tool for MMI systems. In: Information Processing Society of Japan, Technical Report 2002-SLP-43 (in Japanese) (2002) pp 7–12
Cassell, J., Bickmore, T., Campbell, L., Chang, K., Vilhjálmsson, H., Yan, H.: Requirements for an architecture for embodied conversational characters. In: Proceedings of Computer Animation and Simulation '99 (Eurographics Series), ed Thalmann, D., Thalmann, N. (1999) pp 109–122
DARPA Communicator Program. http://fofoca.mitre.org/ (1998)
Dohi, H., Ishizuka, M.: Visual Software Agent: A realistic face-to-face style interface connected with WWW/Netscape. In: IJCAI Workshop on Intelligent Multimodal Systems (1997) pp 17–22
Ekman, P., Friesen, W.V.: Facial Action Coding System (FACS): A technique for the measurement of facial action (Consulting Psychologists Press, Palo Alto, CA 1978)
Galatea Toolkit. http://hil.t.u-tokyo.ac.jp/~galatea/ (2002)
Galatea Toolkit. http://iipl.jaist.ac.jp/IPA/ (2002)
Gustafson, J., Lindberg, N., Lundeberg, M.: The August spoken dialogue system. In: EuroSpeech (1999) pp 1151–1154
HMM-Based Speech Synthesis Toolkit. http://hts.ics.nitech.ac.jp/ (2002)
Julia, L., Cheyer, A.: Is talking to virtual more realistic? In: EuroSpeech (1999) pp 1719–1722
Katsurada, K., Otani, Y., Nakamura, Y., Kobayashi, S., Yamada, H., Nitta, T.: A modality-independent MMI system architecture. In: Proceedings ICSLP (2002) pp 2549–2552
Kawahara, T., Kobayashi, T., Takeda, T., Minematsu, N., Itou, K., Yamamoto, M., Utsuro, T., Shikano, K.: Sharable software repository for Japanese large vocabulary continuous speech recognition. In: Proceedings ICSLP (1998) pp 3257–3260
MMI Description Language XISL. http://www.vox.tutkie.tut.ac.jp/XISL/XISLE.html (2002)
Morishima, S.: Face analysis and synthesis. IEEE Signal Processing Magazine 18 (3): 26–34 (2001)
Morphological Analyzer ChaSen. http://chasen.aist-nara.ac.jp/index.html.en (2000)
Nishimoto, T., Araki, M., Niimi, Y.: RadioDoc: A voice-accessible document system. In: Proceedings ICSLP (2002) pp 1485–1488
The Open Agent Architecture. http://www.ai.sri.com/~oaa/ (2001)
Sakamoto, K., Hinode, H., Watanuki, K., Seki, S., Kiyama, J., Togawa, F.: A response model for a CG character based on timing of interactions in a multimodal human interface. In: IUI-97 (1997) pp 257–260
Seneff, S., Hurley, E., Lau, R., Pao, C., Schmid, P., Zue, V.: GALAXY-II: A reference architecture for conversational system development. In: Proceedings ICSLP (1998) pp 931–934
Standard of symbols for Japanese text-to-speech synthesizer: JEIDA-62-2000 (2000)
Sutton, S., Cole, R.: Universal speech tools: The CSLU Toolkit. In: Proceedings ICSLP (1998) pp 3221–3224
Tamura, M., Masuko, T., Tokuda, K., Kobayashi, T.: Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR. In: ICASSP, Vol. 2 (2001) pp 805–808
Tamura, M., Masuko, T., Tokuda, K., Kobayashi, T.: Text-to-speech synthesis with arbitrary speaker’s voice from average voice. In: Proceedings of European Conference on Speech Communication and Technology, Vol. 1 (2001) pp 345–348
Ushida, H., Hirayama, Y., Nakajima, H.: Emotion model for life-like agent and its evaluation. In: AAAI-98 (1998) pp 62–69
Voice eXtensible Markup Language (VoiceXML), Version 1.0. http://www.voicexml.org (2000)
XSL Transformations (XSLT), Version 1.0. http://www.w3.org/TR/xslt (1999)
Yoshimura, T., Tokuda, K., Masuko, T., Kobayashi, T., Kitamura, T.: Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis. In: EuroSpeech, Vol. 5 (1999) pp 2347–2350
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kawamoto, Si. et al. (2004). Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents. In: Prendinger, H., Ishizuka, M. (eds) Life-Like Characters. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-08373-4_9
DOI: https://doi.org/10.1007/978-3-662-08373-4_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05655-0
Online ISBN: 978-3-662-08373-4
eBook Packages: Springer Book Archive