Human Computing and Machine Understanding of Human Behavior: A Survey

  • Maja Pantic
  • Alex Pentland
  • Anton Nijholt
  • Thomas S. Huang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4451)

Abstract

A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. If this prediction is to come true, then next-generation computing should be about anticipatory user interfaces that are human-centered, built for humans based on human models. They should transcend the traditional keyboard and mouse to include natural, human-like interactive functions, including understanding and emulating certain human behaviors such as affective and social signaling. This article discusses how far we are from enabling computers to understand human behavior.

Keywords

Human sensing, Human Behavior Understanding, Multimodal Data Analysis, Affective Computing, Socially-aware Computing

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Maja Pantic
    • 1
    • 3
  • Alex Pentland
    • 2
  • Anton Nijholt
    • 3
  • Thomas S. Huang
    • 4
  1. Computing Dept., Imperial College London, London, UK
  2. Media Lab, Massachusetts Institute of Technology, USA
  3. EEMCS, University of Twente, Enschede, The Netherlands
  4. Beckman Institute, University of Illinois at Urbana-Champaign, USA