Applications to Safe Human–Robot Interaction

  • Christian Wöhler
Part of the book series (XMEDIAPUBL)


In this chapter we address safe human–robot interaction in the industrial production environment. In car manufacturing, for example, production processes consist either of fully automatic sequences carried out solely by industrial robots or of fully manual assembly steps in which only humans work together on the same task. Close collaboration between humans and industrial robots is very limited and usually not possible because of safety concerns. Industrial production could become more efficient through close collaboration between humans and machines that exploits the unique capabilities of each, but this requires sophisticated techniques for human–robot interaction. In this context, recognising interactions between humans and industrial robots requires vision methods for three-dimensional pose estimation and for tracking the motion of human body parts based on three-dimensional scene analysis. We begin with an overview of gesture recognition methods in the general context of human–robot interaction, followed by an overview of vision-based safe human–robot interaction. We then evaluate, in a typical industrial production environment, the performance of the three-dimensional approach to the detection and tracking of objects in point clouds described in Chap. 1. The methods introduced for three-dimensional detection, pose estimation, and tracking of human body parts, and for recognition of their actions, are evaluated in similar scenarios.


Keywords: Gesture Recognition, Industrial Robot, Test Person, Ground Truth Data, Robot Interaction


Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  • Christian Wöhler, Department of Electrical Engineering and IT, Technical University of Dortmund, Dortmund, Germany