Towards Learning Nonverbal Identities from the Web: Automatically Identifying Visually Accentuated Words

  • Conference paper
Intelligent Virtual Agents (IVA 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8637)

Abstract

This paper presents a long-term idea: automatically learning, from online multimedia content such as videos from YouTube channels, a portfolio of nonverbal identities in the form of computational representations of a speaker's prototypical gestures. As a first step towards this vision, the paper presents proof-of-concept experiments that automatically identify visually accentuated words from a collection of online videos of the same person. The experimental results are promising: many accentuated words were identified automatically, and specific head motion patterns were associated with these words.
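
The abstract does not describe how accentuated words are detected, so the following is only an illustrative sketch of one plausible approach, not the authors' method. It assumes each spoken word has already been aligned with a scalar head-motion energy (the pairing, the function name `accentuated_words`, and the parameters `z_threshold` and `min_count` are all hypothetical), and it flags words whose average motion is well above the speaker's overall average.

```python
from collections import defaultdict
from statistics import mean, pstdev


def accentuated_words(word_motion, z_threshold=1.0, min_count=2):
    """Rank words whose mean head-motion energy is unusually high.

    word_motion: iterable of (word, energy) pairs, one per spoken word
    occurrence. The energy measure is an assumption of this sketch; it
    could be, for example, summed head rotation speed over the word.
    """
    samples = list(word_motion)
    energies = [energy for _, energy in samples]
    if not energies:
        return []

    # Speaker-level baseline over all word occurrences.
    mu = mean(energies)
    sigma = pstdev(energies)
    if sigma == 0:
        return []  # no variation, so nothing stands out

    # Group occurrences by word (case-insensitive).
    by_word = defaultdict(list)
    for word, energy in samples:
        by_word[word.lower()].append(energy)

    scored = []
    for word, values in by_word.items():
        if len(values) < min_count:
            continue  # too few occurrences to say anything
        z = (mean(values) - mu) / sigma
        if z >= z_threshold:
            scored.append((word, z))

    # Most strongly accentuated words first.
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

In this sketch, a word is only reported if it occurs at least `min_count` times, which reflects the abstract's setting of many videos of the same person: repeated occurrences are what make a per-word pattern meaningful.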

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zadeh, A.B., Sagae, K., Morency, L.P. (2014). Towards Learning Nonverbal Identities from the Web: Automatically Identifying Visually Accentuated Words. In: Bickmore, T., Marsella, S., Sidner, C. (eds) Intelligent Virtual Agents. IVA 2014. Lecture Notes in Computer Science, vol 8637. Springer, Cham. https://doi.org/10.1007/978-3-319-09767-1_60

  • DOI: https://doi.org/10.1007/978-3-319-09767-1_60

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09766-4

  • Online ISBN: 978-3-319-09767-1

  • eBook Packages: Computer Science, Computer Science (R0)