Hand Pose Estimation and Hand Shape Classification Using Multi-layered Randomized Decision Forests

  • Cem Keskin
  • Furkan Kıraç
  • Yunus Emre Kara
  • Lale Akarun
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7577)

Abstract

Vision-based articulated hand pose estimation and hand shape classification are challenging problems. This paper proposes novel algorithms to perform these tasks using depth sensors. In particular, we introduce a randomized decision forest (RDF) based hand shape classifier and use it in a multi-layered RDF framework for articulated hand pose estimation. This classifier assigns the input depth pixels to hand shape classes and directs them to the corresponding hand pose estimators trained specifically for that hand shape. We introduce two types of multi-layered RDFs, Global Expert Network (GEN) and Local Expert Network (LEN), which achieve significantly better hand pose estimates than a single-layered skeleton estimator and generalize better to previously unseen hand poses. The hand shape classifier is also shown to be accurate and fast. The methods run in real time on the CPU and can be ported to the GPU for a further increase in speed.
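
The routing idea in the abstract can be illustrated with a small sketch. The abstract does not spell out how GEN and LEN differ, so the code below rests on an assumption: a GEN-style router pools the per-pixel shape votes and sends every pixel to the single winning expert, while a LEN-style router sends each pixel to the expert chosen by that pixel's own shape label. All names here (`route_global`, `route_local`, `classify_shape`, `experts`) are hypothetical, and the toy lambdas stand in for trained RDFs; this is not the authors' implementation.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Pixel = Sequence[float]                   # hypothetical per-pixel feature vector
ShapeClassifier = Callable[[Pixel], int]  # pixel -> hand shape class id
PoseExpert = Callable[[Pixel], int]       # pixel -> label from a shape-specific estimator


def route_global(pixels: List[Pixel],
                 classify_shape: ShapeClassifier,
                 experts: Dict[int, PoseExpert]) -> List[int]:
    """Assumed GEN-style routing: one expert for the whole image."""
    # Every pixel votes for a hand shape class; the majority class wins.
    votes = Counter(classify_shape(p) for p in pixels)
    winner = votes.most_common(1)[0][0]
    # All pixels are evaluated by the expert trained for the winning shape.
    return [experts[winner](p) for p in pixels]


def route_local(pixels: List[Pixel],
                classify_shape: ShapeClassifier,
                experts: Dict[int, PoseExpert]) -> List[int]:
    """Assumed LEN-style routing: each pixel picks its own expert."""
    return [experts[classify_shape(p)](p) for p in pixels]


# Toy stand-ins for trained forests (illustrative only).
classify_shape = lambda p: 0 if p[0] < 0.5 else 1
experts = {0: lambda p: 10, 1: lambda p: 20}
pixels = [(0.1,), (0.2,), (0.9,)]

print(route_global(pixels, classify_shape, experts))  # [10, 10, 10]
print(route_local(pixels, classify_shape, experts))   # [10, 10, 20]
```

In the toy run, two of the three pixels vote for shape 0, so the global router uses expert 0 everywhere, whereas the local router lets the third pixel use expert 1.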

Keywords

  • Posterior Probability
  • Leaf Node
  • Depth Image
  • Spectral Cluster
  • Depth Sensor
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Cem Keskin (1)
  • Furkan Kıraç (1)
  • Yunus Emre Kara (1)
  • Lale Akarun (1)
  1. Computer Engineering Department, Boğaziçi University, Istanbul, Turkey
