3D Mean-Shift Tracking of Human Body Parts and Recognition of Working Actions in an Industrial Environment

  • Markus Hahn
  • Fuad Quronfuleh
  • Christian Wöhler
  • Franz Kummert
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6219)


In this study we describe a method for recognising and discriminating between different working actions in an industrial environment based on 3D trajectories. The scene is represented by a motion-attributed 3D point cloud computed from the images of a small-baseline trinocular camera system. A two-stage mean-shift algorithm detects all moving objects in the scene and tracks them in 3D. A sequence of working actions is recognised by particle-filter-based matching of a non-stationary Hidden Markov Model, relying on spatial context and a classification of the observed 3D trajectories. The system is able to pick out an object performing a known action from a multitude of tracked objects. The metric accuracy of the 3D tracking stage is evaluated on nine real-world test image sequences for which ground truth data were determined. The action recognition stage is evaluated experimentally on 20 real-world test sequences acquired from different viewpoints in an industrial working environment. We show that our system is able to track human body parts in 3D and subsequently recognise working actions under difficult, realistic conditions. When the sequence of working actions is interrupted, the system detects this, enters a safety mode, and returns to the regular mode as soon as the working actions continue.
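The abstract names mean shift as the tracking primitive but does not spell out the two-stage procedure. As a rough illustration of the basic idea only, the sketch below runs plain mean-shift mode seeking on a small 3D point set. It is not the authors' two-stage algorithm: the function name `mean_shift_mode`, the Gaussian kernel, the `bandwidth` parameter, and the toy points are all assumptions made for this example.

```python
import math


def mean_shift_mode(points, start, bandwidth, max_iter=100, tol=1e-6):
    """Move `start` to a nearby density mode of `points` (iterable of 3D tuples).

    Hypothetical helper for illustration; a Gaussian kernel is assumed.
    """
    x = tuple(start)
    for _ in range(max_iter):
        weight_sum = 0.0
        acc = [0.0, 0.0, 0.0]
        for p in points:
            d2 = sum((pi - xi) ** 2 for pi, xi in zip(p, x))
            w = math.exp(-0.5 * d2 / bandwidth ** 2)  # Gaussian kernel weight
            weight_sum += w
            for k in range(3):
                acc[k] += w * p[k]
        if weight_sum == 0.0:
            break  # no point lies within numerical reach of the kernel
        new_x = tuple(a / weight_sum for a in acc)  # kernel-weighted mean
        shift = math.dist(new_x, x)
        x = new_x
        if shift < tol:
            break  # converged: the mean-shift vector is (almost) zero
    return x


# Toy data: two well-separated clusters standing in for two moving objects.
cluster_a = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (-0.1, 0.0, 0.0),
             (0.0, 0.1, 0.0), (0.0, -0.1, 0.0)]
cluster_b = [(10.0, 10.0, 10.0), (10.1, 10.0, 10.0), (9.9, 10.0, 10.0)]
points = cluster_a + cluster_b

# Starting near cluster A, the estimate settles on that cluster's centre.
mode = mean_shift_mode(points, start=(1.0, 1.0, 1.0), bandwidth=1.0)
print(mode)
```

In a tracking setting, one would typically restart such an iteration in each frame from the previous frame's result, so that the estimate follows the moving cluster. How the paper organises its two stages is not stated in the abstract.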



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Markus Hahn (1)
  • Fuad Quronfuleh (1)
  • Christian Wöhler (2)
  • Franz Kummert (3)

  1. Daimler AG, Group Research and Advanced Engineering, Ulm, Germany
  2. Image Analysis Group, Dortmund University of Technology, Dortmund, Germany
  3. Applied Informatics, Bielefeld University, Bielefeld, Germany
