Human Body Pose Interpretation and Classification for Social Human-Robot Interaction

Original Paper


A novel breed of robots known as socially assistive robots is emerging. These robots are capable of providing assistance to individuals through social and cognitive interaction. However, a number of research issues must be addressed in order to design such robots. In this paper, we address one main challenge in the development of intelligent socially assistive robots: the robot’s ability to identify human non-verbal communication during assistive interactions. In particular, we present a unique non-contact and non-restricting automated sensor-based approach for identifying and categorizing human upper body language in order to determine how accessible a person is to the robot during natural real-time human-robot interaction (HRI). This classification allows a robot to effectively determine its own reactive task-driven behavior during assistive interactions. Human body language is an important aspect of communicative non-verbal behavior: body pose and position can play a vital role in conveying human intent, moods, attitudes and affect. Preliminary experiments show the potential of integrating the proposed body language recognition and classification technique into socially assistive robotic systems partaking in HRI scenarios.


Keywords: Socially assistive robots · Non-verbal communication · 3D human body pose detection and classification · Human-robot interaction
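To make the idea of mapping upper body language to an accessibility level concrete, here is a minimal illustrative sketch. It is not the paper's actual method: the feature names (`trunk_lean_deg`, `shoulder_rotation_deg`, `arms_crossed`), thresholds, and scoring rule are all hypothetical, chosen only to show how coarse pose cues extracted from a non-contact sensor might be rolled up into a categorical accessibility label that a robot could act on.

```python
# Hypothetical sketch of pose-to-accessibility classification.
# All features and thresholds below are illustrative assumptions,
# not values taken from the paper.
from dataclasses import dataclass


@dataclass
class UpperBodyPose:
    trunk_lean_deg: float         # positive = leaning toward the robot
    shoulder_rotation_deg: float  # 0 = shoulders squarely facing the robot
    arms_crossed: bool            # closed arm posture


def accessibility_level(pose: UpperBodyPose) -> str:
    """Map coarse upper-body features to a categorical accessibility level."""
    score = 0
    # Leaning toward the robot suggests openness; leaning away suggests the opposite.
    if pose.trunk_lean_deg > 5:
        score += 2
    elif pose.trunk_lean_deg < -5:
        score -= 2
    # Facing the robot (small shoulder rotation) suggests engagement.
    if abs(pose.shoulder_rotation_deg) < 30:
        score += 1
    else:
        score -= 1
    # Crossed arms are commonly read as a closed, less accessible posture.
    if pose.arms_crossed:
        score -= 1
    if score >= 2:
        return "high"
    if score >= 0:
        return "medium"
    return "low"


# Leaning in, facing the robot, arms open -> high accessibility
print(accessibility_level(UpperBodyPose(10.0, 5.0, False)))  # high
# Leaning away, turned aside, arms crossed -> low accessibility
print(accessibility_level(UpperBodyPose(-20.0, 90.0, True)))  # low
```

A deployed system would replace the hand-set thresholds with a classifier trained on labeled pose data, but the categorical output is what lets the robot select its reactive behavior.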



Copyright information

© Springer Science & Business Media BV 2011

Authors and Affiliations

  1. Autonomous Systems and Biomechatronics Laboratory, Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada
  2. Autonomous Systems Laboratory, Department of Mechanical Engineering, State University of New York at Stony Brook, Stony Brook, USA
  3. Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada
  4. State University of New York at Stony Brook, Stony Brook, USA
