Hand Orientation Regression Using Random Forest for Augmented Reality

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8853)


We present a regression method for estimating hand orientation using an uncalibrated camera. To train the system, we use a depth camera to capture a large dataset of color images of hands together with their orientation angles. Each color image is segmented to produce a silhouette image, from which contour distance features are extracted. The orientation angles are obtained by robustly fitting a plane to the depth image of the hand, which provides a surface normal encoding the hand orientation in 3D space. We then train multiple Random Forest regressors to learn the non-linear mapping from the space of silhouette images to orientation angles. At test time, the system requires only a standard 2D image to infer the 3D hand orientation. Experimental results show that the approach is computationally efficient, does not require any camera calibration, and is robust to inter-person shape variation.
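
The step that turns a depth image into orientation angles can be illustrated with a small sketch. It is not the authors' implementation: it uses an ordinary least-squares plane fit rather than the robust fit mentioned in the abstract, and the function names (`fit_plane_normal`, `normal_to_angles`), the sign of the normal, and the angle convention (tilt of the normal in the x-z and y-z planes) are assumptions made for illustration only.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane_normal(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) points.

    Returns a unit normal (nx, ny, nz) with nz > 0 (assumed sign convention).
    Plain least squares: unlike a robust fit, it is sensitive to outliers.
    """
    n = float(len(points))
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)

    # Normal equations A @ [a, b, c] = rhs for the model z = a*x + b*y + c.
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = _det3(A)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (e.g. collinear); no unique plane")

    def replaced(col):
        # Copy of A with one column swapped for rhs (Cramer's rule).
        return [[rhs[r] if c == col else A[r][c] for c in range(3)]
                for r in range(3)]

    a = _det3(replaced(0)) / d
    b = _det3(replaced(1)) / d

    # The plane a*x + b*y - z + c = 0 has normal proportional to (-a, -b, 1).
    norm = math.sqrt(a * a + b * b + 1.0)
    return (-a / norm, -b / norm, 1.0 / norm)


def normal_to_angles(normal):
    """Convert a unit normal to two tilt angles in degrees (assumed convention)."""
    nx, ny, nz = normal
    angle_xz = math.degrees(math.atan2(nx, nz))  # tilt of the normal in the x-z plane
    angle_yz = math.degrees(math.atan2(ny, nz))  # tilt of the normal in the y-z plane
    return angle_xz, angle_yz
```

Under these assumptions, each training image would be paired with the two angles returned by `normal_to_angles(fit_plane_normal(points))`, where `points` are the 3D coordinates of the hand's depth pixels; those pairs would then serve as regression targets for the forest.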


Keywords: Orientation estimation, Random forest regression, Silhouette image, Hand





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. City University London, London, UK
