
Gestural cue analysis in automated semantic miscommunication annotation


Abstract

The automated annotation of conversational video with semantic miscommunication labels is a challenging topic. Although miscommunications are often obvious to the speakers as well as to observers, it is difficult for machines to detect them from low-level features. In this paper, we investigate the utility of gestural cues among various non-verbal features. Compared with gesture recognition tasks in human-computer interaction, this task is difficult because gestures are implicit and it is not well understood which cues contribute to miscommunication. Nine simple gestural features are extracted from gesture data, and both simple and complex classifiers are constructed using machine learning. The experimental results suggest that no single gestural feature can predict or explain the occurrence of semantic miscommunication in our setting.
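
A minimal sketch of the idea described above, under stated assumptions: each conversational segment is represented by nine simple gestural features and fed to both a simple and a more complex classifier, whose cross-validated accuracy can then be compared against chance. The feature values, data shapes, and scikit-learn models (logistic regression and an RBF-kernel SVM) are illustrative assumptions, not the authors' actual features or experimental setup.

```python
# Hypothetical sketch: compare a simple and a complex classifier on
# per-segment gestural features for miscommunication annotation.
# All data below are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))        # 200 segments x 9 gestural features
y = rng.integers(0, 2, size=200)     # 1 = miscommunication, 0 = none

classifiers = {
    "simple (logistic regression)": make_pipeline(StandardScaler(), LogisticRegression()),
    "complex (RBF-kernel SVM)": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
}

for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)   # 5-fold cross-validated accuracy
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```

With real annotated gesture data in place of the synthetic arrays, accuracy near the chance baseline for every individual feature would be consistent with the finding that no single gestural cue predicts miscommunication on its own.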



Acknowledgements

We would like to thank the m-project members, especially Kunio Tanabe and Tomoko Matsui, for commenting on an earlier version of this paper. This research was partially supported by Grants-in-Aid for Scientific Research 19530620 and 21500266, the National Science Foundation under Grant CCF-0958490, the National Institutes of Health under Grant 1-RC2-HG005668-01, and the Function and Induction Research Project, Transdisciplinary Research Integration Center of the Research Organization of Information and Systems.

Author information


Corresponding author

Correspondence to Masashi Inoue.


About this article

Cite this article

Inoue, M., Ogihara, M., Hanada, R. et al. Gestural cue analysis in automated semantic miscommunication annotation. Multimed Tools Appl 61, 7–20 (2012). https://doi.org/10.1007/s11042-010-0701-1



