Journal of Intelligent & Robotic Systems, Volume 83, Issue 3–4, pp 339–358

Robotic Hand Pose Estimation Based on Stereo Vision and GPU-enabled Internal Graphical Simulation

  • Pedro Vicente
  • Lorenzo Jamone
  • Alexandre Bernardino


Humanoid robots have complex kinematic chains whose modeling is error prone. If the robot model is not well calibrated, the hand pose cannot be determined precisely from the encoder readings, which degrades reaching and grasping accuracy. In our work, we propose a novel method to simultaneously i) estimate the pose of the robot hand, and ii) calibrate the robot kinematic model. This is achieved by combining stereo vision, proprioception, and a 3D computer graphics model of the robot. Notably, GPU programming allows the estimation and calibration to be performed in real time during the execution of arm reaching movements. Proprioceptive information is exploited to generate hypotheses about the visual appearance of the hand in the camera images, using the 3D computer graphics model of the robot, which includes both kinematic and texture information. These hypotheses are compared with the actual visual input using particle filtering, to obtain both i) the best estimate of the hand pose and ii) a set of joint offsets to calibrate the kinematics of the robot model. We evaluate two different approaches to estimate the 6D pose of the hand from vision (silhouette segmentation and edge extraction) and show experimentally that the pose estimation error is considerably reduced with respect to the nominal robot model. Moreover, the GPU implementation runs about 3 times faster than the CPU implementation, allowing real-time operation.
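The hypothesize-and-compare loop described in the abstract can be illustrated with a toy particle filter. The sketch below is not the authors' implementation: it assumes a planar 2-link arm instead of the humanoid kinematic chain, and it replaces the rendered-image comparison with a Gaussian score on the distance between a predicted and an observed 2D hand position. All names (`LINKS`, `hand_position`, `likelihood`, `estimate_offsets`, `simulate_reach`) and all numeric values are hypothetical choices made for illustration.

```python
import math
import random

LINKS = (0.30, 0.25)  # hypothetical link lengths in metres


def hand_position(encoders, offsets):
    """Planar 2-link forward kinematics: joint angle = encoder reading + offset."""
    q1 = encoders[0] + offsets[0]
    q2 = encoders[1] + offsets[1]
    x = LINKS[0] * math.cos(q1) + LINKS[1] * math.cos(q1 + q2)
    y = LINKS[0] * math.sin(q1) + LINKS[1] * math.sin(q1 + q2)
    return (x, y)


def likelihood(predicted, observed, sigma=0.01):
    """Gaussian score of one hypothesis (assumed stand-in for image comparison)."""
    d2 = (predicted[0] - observed[0]) ** 2 + (predicted[1] - observed[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def estimate_offsets(readings, observations, n_particles=1000,
                     prior_std=0.1, diffusion_std=0.005, seed=0):
    """Estimate two joint offsets (radians) with a bootstrap particle filter."""
    rng = random.Random(seed)
    # Each particle is one hypothesis about the pair of joint offsets.
    particles = [(rng.gauss(0.0, prior_std), rng.gauss(0.0, prior_std))
                 for _ in range(n_particles)]
    for encoders, observed in zip(readings, observations):
        # Diffuse the hypotheses slightly so the set keeps some diversity.
        particles = [(a + rng.gauss(0.0, diffusion_std),
                      b + rng.gauss(0.0, diffusion_std)) for a, b in particles]
        # Score each hypothesis against the current observation.
        weights = [likelihood(hand_position(encoders, p), observed)
                   for p in particles]
        if sum(weights) <= 0.0:
            weights = [1.0] * n_particles  # no hypothesis is plausible: uniform
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    mean_a = sum(p[0] for p in particles) / n_particles
    mean_b = sum(p[1] for p in particles) / n_particles
    return (mean_a, mean_b)


def simulate_reach(true_offsets, steps=40):
    """Generate encoder readings along a reach and noise-free hand observations."""
    readings = [(0.2 + 0.02 * k, 0.8 + 0.01 * k) for k in range(steps)]
    observations = [hand_position(r, true_offsets) for r in readings]
    return readings, observations
```

Calling `simulate_reach((0.05, -0.03))` and passing the result to `estimate_offsets` returns an offset estimate that can be added to the encoder readings to correct the predicted hand position. In the article the hypotheses are rendered views of the hand compared against stereo camera images; the point observation used here only keeps the sketch short and self-contained.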


Robot hand pose estimation; Robot self-calibration; Humanoid robots; 3D graphical simulation; GPU programming; Online reaching adaptation; Computer vision





Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Pedro Vicente (1)
  • Lorenzo Jamone (1)
  • Alexandre Bernardino (1)

  1. Institute for Systems and Robotics (ISR/IST), LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
