Head X: Customizable Audiovisual Synthesis for a Multi-purpose Virtual Head
The development of embodied conversational agents (ECAs) involves a wide range of cutting-edge technologies, ranging from multimodal perception through reasoning to synthesis. While each is important to a successful outcome, it is the synthesis that has the most immediate impact on the observer. The specific appearance and voice of an ECA can be decisive factors in meeting its social objectives. In light of this, we have developed an extensively customizable system for synthesizing a virtual talking 3D head. Rather than requiring explicit integration into a codebase, our software runs as a service that can be controlled by any external client, which substantially simplifies its deployment into new applications. We have explored the benefits of this approach across several internal research projects and student exercises as part of a university topic on ECAs.
Keywords: Embodied conversational agents, audiovisual speech synthesis, software library