Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents

  • Chapter
Life-Like Characters

Summary

Galatea is a software toolkit for developing human-like spoken dialog agents. To make it easy to integrate modules with different characteristics, including a speech recognizer, a speech synthesizer, a facial animation synthesizer, and a dialog controller, each module is modeled as a virtual machine with a simple common interface, and the modules are connected to one another through a broker (communication manager). Galatea employs model-based speech and facial animation synthesizers whose model parameters can easily be adapted to an existing person, given his or her training data. The toolkit, which runs on both UNIX/Linux and Windows operating systems, will be publicly available in the middle of 2003 [6, 7].
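
The broker arrangement described in the summary can be illustrated with a minimal Python sketch. It is not Galatea's actual interface or protocol: the class names `Module` and `Broker`, the `handle(command, argument)` signature, the command strings, and the stub handlers are all assumptions made for illustration. The only idea taken from the summary is that every module exposes the same simple interface and that modules exchange messages through a broker rather than calling each other directly.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# A message is (target module name, command, argument). Hypothetical format.
Message = Tuple[str, str, str]


class Module:
    """Hypothetical module wrapper: every module exposes the same handle() call."""

    def __init__(self, name: str, handler: Callable[[str, str], List[Message]]):
        self.name = name
        self._handler = handler

    def handle(self, command: str, argument: str) -> List[Message]:
        # Returns the follow-up messages this module wants the broker to route.
        return self._handler(command, argument)


class Broker:
    """Hypothetical communication manager that routes messages between modules."""

    def __init__(self) -> None:
        self._modules: Dict[str, Module] = {}
        self.log: List[Message] = []  # every message routed, in order

    def register(self, module: Module) -> None:
        self._modules[module.name] = module

    def send(self, target: str, command: str, argument: str) -> None:
        # Route the initial message, then any follow-ups, in FIFO order.
        queue = deque([(target, command, argument)])
        while queue:
            name, cmd, arg = queue.popleft()
            self.log.append((name, cmd, arg))
            queue.extend(self._modules[name].handle(cmd, arg))


def build_demo_broker() -> Broker:
    """Wire up four stub modules named after the components in the summary."""
    broker = Broker()
    # Stub recognizer: pretends the argument is already the recognized text.
    broker.register(Module("recognizer", lambda c, a: [("dialog", "input", a)]))
    # Stub dialog controller: asks for speech output and a facial expression.
    broker.register(Module("dialog", lambda c, a: [
        ("synthesizer", "speak", "You said: " + a),
        ("face", "animate", "smile"),
    ]))
    # Stub output modules: consume the message and produce no follow-ups.
    broker.register(Module("synthesizer", lambda c, a: []))
    broker.register(Module("face", lambda c, a: []))
    return broker


if __name__ == "__main__":
    demo = build_demo_broker()
    demo.send("recognizer", "recognize", "hello")
    for message in demo.log:
        print(message)
```

Because each stub only knows the broker's message format, any one of them could in principle be swapped for a different implementation without touching the others, which is the integration benefit the summary attributes to the common interface.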

References

  1. Adachi, H., Katsurada, K., Yamada, H., Nitta, T.: Development of a prototyping tool for MMI systems. In: Information Processing Society of Japan, Technical Report 2002-SLP-43 (in Japanese) (2002) pp 7–12

  2. Cassell, J., Bickmore, T., Campbell, L., Chang, K., Vilhjálmsson, H., Yan, H.: Requirements for an architecture for embodied conversational characters. In: Proceedings of Computer Animation and Simulation '99 (Eurographics Series), ed Thalmann, D., Thalmann, N. (1999) pp 109–122

  3. DARPA Communicator Program. http://fofoca.mitre.org/ (1998)

  4. Dohi, H., Ishizuka, M.: Visual Software Agent: A realistic face-to-face style interface connected with WWW/Netscape. In: IJCAI Workshop on Intelligent Multimodal Systems (1997) pp 17–22

  5. Ekman, P., Friesen, W.V.: Facial Action Coding System (FACS): A technique for the measurement of facial action (Consulting Psychologists Press, Palo Alto, CA 1978)

  6. Galatea Toolkit. http://hil.t.u-tokyo.ac.jp/~galatea/ (2002)

  7. Galatea Toolkit. http://iipl.jaist.ac.jp/IPA/ (2002)

  8. Gustafson, J., Lindberg, N., Lundeberg, M.: The August spoken dialogue system. In: EuroSpeech (1999) pp 1151–1154

  9. HMM-Based Speech Synthesis Toolkit. http://hts.ics.nitech.ac.jp/ (2002)

  10. Julia, L., Cheyer, A.: Is talking to virtual more realistic? In: EuroSpeech (1999) pp 1719–1722

  11. Katsurada, K., Otani, Y., Nakamura, Y., Kobayashi, S., Yamada, H., Nitta, T.: A modality-independent MMI system architecture. In: Proceedings ICSLP (2002) pp 2549–2552

  12. Kawahara, T., Kobayashi, T., Takeda, T., Minematsu, N., Itou, K., Yamamoto, M., Utsuro, T., Shikano, K.: Sharable software repository for Japanese large vocabulary continuous speech recognition. In: Proceedings ICSLP (1998) pp 3257–3260

  13. MMI Description Language XISL. http://www.vox.tutkie.tut.ac.jp/XISL/XISLE.html (2002)

  14. Morishima, S.: Face analysis and synthesis. IEEE Signal Processing Magazine 18 (3): 26–34 (2001)

  15. Morphological Analyzer ChaSen. http://chasen.aist-nara.ac.jp/index.html.en (2000)

  16. Nishimoto, T., Araki, M., Niimi, Y.: RadioDoc: A voice-accessible document system. In: Proceedings ICSLP (2002) pp 1485–1488

  17. The Open Agent Architecture. http://www.ai.sri.com/~oaa/ (2001)

  18. Sakamoto, K., Hinode, H., Watanuki, K., Seki, S., Kiyama, J., Togawa, F.: A response model for a CG character based on timing of interactions in a multimodal human interface. In: IUI-97 (1997) pp 257–260

  19. Seneff, S., Hurley, E., Lau, R., Pao, C., Schmid, P., Zue, V.: GALAXY-II: A reference architecture for conversational system development. In: Proceedings ICSLP (1998) pp 931–934

  20. Standard of symbols for Japanese text-to-speech synthesizer: JEIDA-62-2000 (2000)

  21. Sutton, S., Cole, R.: Universal speech tools: The CSLU Toolkit. In: Proceedings ICSLP (1998) pp 3221–3224

  22. Tamura, M., Masuko, T., Tokuda, K., Kobayashi, T.: Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR. In: ICASSP, Vol. 2 (2001) pp 805–808

  23. Tamura, M., Masuko, T., Tokuda, K., Kobayashi, T.: Text-to-speech synthesis with arbitrary speaker's voice from average voice. In: Proceedings of European Conference on Speech Communication and Technology, Vol. 1 (2001) pp 345–348

  24. Ushida, H., Hirayama, Y., Nakajima, H.: Emotion model for life-like agent and its evaluation. In: AAAI-98 (1998) pp 62–69

  25. Voice eXtensible Markup Language (VoiceXML), Version 1.0. http://www.voicexml.org (2000)

  26. XSL Transformations (XSLT), Version 1.0. http://www.w3.org/TR/xslt (1999)

  27. Yoshimura, T., Tokuda, K., Masuko, T., Kobayashi, T., Kitamura, T.: Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis. In: EuroSpeech, Vol. 5 (1999) pp 2347–2350

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kawamoto, Si. et al. (2004). Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents. In: Prendinger, H., Ishizuka, M. (eds) Life-Like Characters. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-08373-4_9

  • DOI: https://doi.org/10.1007/978-3-662-08373-4_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05655-0

  • Online ISBN: 978-3-662-08373-4

  • eBook Packages: Springer Book Archive
