Human–Computer Interaction

  • Dennis Lin
  • Vuong Le
  • Thomas Huang


This chapter focuses on observing users in order to build more intuitive and friendly interfaces. We cover two general types of input modalities: tracking of the head and eyes, and tracking of the hands, especially the fingers. Head pose and eye-gaze estimation allow the computer to infer the direction of the user’s focus, and we provide an overview of the many techniques that researchers have applied to this task. We also consider the automated analysis of hand and finger gestures, which are more active modalities designed to communicate with and issue commands to the computer. We provide a taxonomy of gestures in the context of human–computer interaction and survey the techniques used to recognize them. Finally, we discuss possible applications of these input modalities. In general, we conclude that existing systems are not yet mature, but that there is great potential for future research.


Keywords: Sign Language; Gesture Recognition; American Sign Language; Hand Shape; Locality Sensitive Hashing
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  1. Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, USA
