3D Research, 1:3

3D pose estimation and motion analysis of the articulated human hand-forearm limb in an industrial production environment

  • Markus Hahn
  • Björn Barrois
  • Lars Krüger
  • Christian Wöhler
  • Gerhard Sagerer
  • Franz Kummert
3DR Review


This study introduces an approach to model-based 3D pose estimation and instantaneous motion analysis of the human hand-forearm limb in the application context of safe human-robot interaction. 3D pose estimation is performed using two approaches: The Multiocular Contracting Curve Density (MOCCD) algorithm is a top-down technique based on pixel statistics around a contour model projected into the images from several cameras. The Iterative Closest Point (ICP) algorithm is a bottom-up approach which uses a motion-attributed 3D point cloud to estimate the object pose. Due to their orthogonal properties, a fusion of these algorithms is shown to be favorable. The fusion is performed by a weighted combination of the extracted pose parameters in an iterative manner. The analysis of object motion is based on the pose estimation result and the motion-attributed 3D points belonging to the hand-forearm limb using an extended constraint-line approach which does not rely on any temporal filtering. A further refinement is obtained using the Shape Flow algorithm, a temporal extension of the MOCCD approach, which estimates the temporal pose derivative based on the current and the two preceding images, corresponding to temporal filtering with a short response time of two or at most three frames. Combining the results of the two motion estimation stages provides information about the instantaneous motion properties of the object.
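The fusion step described above, a weighted combination of the pose parameters repeated iteratively, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `fuse_poses` and `iterative_fusion`, the representation of a pose as a flat list of floats, the scalar weights, and the two refinement callables (standing in for one MOCCD update and one ICP update) are all assumptions made for the example.

```python
from typing import Callable, List, Sequence

Pose = List[float]  # assumed: pose parameters stored as a flat list of floats


def fuse_poses(pose_a: Sequence[float], pose_b: Sequence[float],
               weight_a: float, weight_b: float) -> Pose:
    """Weighted mean of two pose parameter vectors.

    Note: a plain weighted mean is only reasonable for angle parameters
    while both estimates stay close together (no wrap-around handling).
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("pose vectors must have the same length")
    total = weight_a + weight_b
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [(weight_a * a + weight_b * b) / total
            for a, b in zip(pose_a, pose_b)]


def iterative_fusion(refine_a: Callable[[Pose], Pose],
                     refine_b: Callable[[Pose], Pose],
                     initial_pose: Sequence[float],
                     weight_a: float, weight_b: float,
                     iterations: int = 3) -> Pose:
    """Feed the fused pose back to both estimators for a fixed number of rounds.

    refine_a and refine_b are hypothetical placeholders for one update of
    a top-down and a bottom-up estimator, each started from the same pose.
    """
    pose = list(initial_pose)
    for _ in range(iterations):
        pose = fuse_poses(refine_a(pose), refine_b(pose), weight_a, weight_b)
    return pose
```

In this sketch the weights are fixed; a real system would more plausibly derive them from a per-estimator quality measure, which the abstract does not specify.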

Experimental investigations are performed on real-world image sequences displaying several test persons performing different working actions typically occurring in an industrial production scenario. In all example scenes, the background is cluttered, and the test persons wear various kinds of clothes. For evaluation, independently obtained ground truth data are used.
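The bottom-up ICP stage named in the abstract alternates between matching each model point to its nearest data point and solving for the rigid motion that best aligns the matched pairs. The sketch below illustrates that loop in a deliberately simplified setting: it works in 2D rather than 3D, uses brute-force nearest-neighbour search, and ignores the motion attributes of the point cloud. All function names are hypothetical and chosen for this example only.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def nearest(p: Point, targets: Sequence[Point]) -> Point:
    """Brute-force nearest neighbour of p among targets."""
    return min(targets, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def best_rigid_2d(src: Sequence[Point], dst: Sequence[Point]) -> Tuple[float, float, float]:
    """Least-squares rotation angle and translation mapping src onto dst (paired points)."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(q[0] for q in dst) / n
    cdy = sum(q[1] for q in dst) / n
    cos_term = 0.0
    sin_term = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy   # centred source point
        bx, by = dx - cdx, dy - cdy   # centred target point
        cos_term += ax * bx + ay * by
        sin_term += ax * by - ay * bx
    theta = math.atan2(sin_term, cos_term)
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty


def icp_2d(src: Sequence[Point], dst: Sequence[Point], iterations: int = 10) -> List[Point]:
    """Return src after repeated match-and-align steps towards dst."""
    current = list(src)
    for _ in range(iterations):
        matches = [nearest(p, dst) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        c, s = math.cos(theta), math.sin(theta)
        current = [(c * x - s * y + tx, s * x + c * y + ty) for x, y in current]
    return current
```

Like any ICP variant, this loop only converges to the correct alignment when the initial pose is close enough for the nearest-neighbour matches to be mostly right, which is one reason a complementary top-down estimator is attractive.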





Copyright information

© 3D Display Research Center and Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Markus Hahn (1)
  • Björn Barrois (1)
  • Lars Krüger (1)
  • Christian Wöhler (2)
  • Gerhard Sagerer (3)
  • Franz Kummert (3)

  1. Daimler Group Research, Environment Perception, Ulm, Germany
  2. Image Analysis Group, Dortmund University of Technology, Dortmund, Germany
  3. Applied Informatics, Faculty of Technology, Bielefeld University, Bielefeld, Germany
