Gesture Features for Coreference Resolution

  • Jacob Eisenstein
  • Randall Davis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4299)


If gesture communicates semantics, as argued by many psychologists, then it should be relevant to bridging the gap between syntax and semantics in natural language processing. One benchmark problem for computational semantics is coreference resolution: determining whether two noun phrases refer to the same semantic entity. Focusing on coreference allows us to conduct a quantitative analysis of the relationship between gesture and semantics, without having to explicitly formalize semantics through an ontology. We introduce a new, small-scale video corpus of spontaneous spoken-language dialogues, from which we have used computer vision to automatically derive a set of gesture features. The relevance of these features to coreference resolution is then discussed. An analysis of the timing of these features also enables us to present new findings on gesture-speech synchronization.
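As a purely illustrative sketch of what a pairwise gesture feature for coreference might look like, the Python below compares the average hand position during two noun phrases. It is not taken from the paper. The names `NounPhrase`, `mean_hand_position`, and `focus_distance`, the (time, x, y) track format, and the use of Euclidean distance are all assumptions made for this example. The intuition it illustrates is that a speaker may gesture toward the same region of space when referring to the same entity.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Tuple

# Hypothetical representation: one tracked hand sample is (time_sec, x, y).
HandSample = Tuple[float, float, float]


@dataclass
class NounPhrase:
    """A noun phrase with the time span (in seconds) over which it was spoken."""
    text: str
    start: float
    end: float


def mean_hand_position(
    track: List[HandSample], start: float, end: float
) -> Optional[Tuple[float, float]]:
    """Average (x, y) of the hand over [start, end]; None if no samples fall inside."""
    points = [(x, y) for t, x, y in track if start <= t <= end]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def focus_distance(
    track: List[HandSample], np_a: NounPhrase, np_b: NounPhrase
) -> Optional[float]:
    """Euclidean distance between the mean hand positions during two noun phrases.

    A small value suggests the speaker gestured at the same location for both
    phrases; this is an assumed feature definition, not the authors' own.
    """
    a = mean_hand_position(track, np_a.start, np_a.end)
    b = mean_hand_position(track, np_b.start, np_b.end)
    if a is None or b is None:
        return None
    return hypot(a[0] - b[0], a[1] - b[1])


# Usage with made-up data: the hand moves from (0, 0) to (3, 4) between phrases.
track = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 3.0, 4.0), (2.5, 3.0, 4.0)]
first = NounPhrase("the wheel", 0.0, 0.5)
second = NounPhrase("it", 2.0, 2.5)
distance = focus_distance(track, first, second)
```

A feature of this kind would typically be one input among many to a classifier over noun-phrase pairs, alongside textual features.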


Keywords: Noun Phrase, Hand Position, Focus Distance, Comparative Feature, Coreference Resolution
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jacob Eisenstein (1)
  • Randall Davis (1)
  1. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
