State-of-the-Art

Visual Perception for Humanoid Robots

Part of the book series: Cognitive Systems Monographs (COSMOS, volume 38)

Abstract

This chapter presents a perspective on the state of the art in technology and methodology for visual perception for humanoid robots. The focus is on relevant work in robot vision for object recognition with 6D pose estimation and for vision-based 6D global self-localization. Special emphasis is placed on visual perception capabilities in human-centered environments.

Notes

  1. “Simultaneous Localization And Mapping” is the widespread usage of the acronym SLAM (http://openslam.org/). The acronym first appeared as “Software Library for Appearance Modeling” in the work of H. Murase et al. [25] (http://www.cs.columbia.edu/CAVE/software/softlib/slam.php).

  2. A quasi-perceptual experience: it resembles a visual experience but occurs in the absence of external stimuli (see [35]).

  3. A percept is the perceptual input object, a mental impression of something perceived by the senses, viewed as the basic component in the formation of concepts (see cognition within artificial intelligence in [65]).

  4. Active vision systems interact with the environment by controlling viewpoint, exposure, and other acquisition parameters instead of passively observing the scene with a fixed configuration. Active sensing usually operates on sequences of images rather than on single images; see [66] and the illustrative sketch below.
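
    The note above describes active vision only in words. The following minimal Python sketch is not taken from the book or the cited works; all function names, the viewpoint-scoring heuristic, and the stub camera/belief-update routines are illustrative assumptions. It only shows the general shape of such a control loop: choose the next acquisition configuration, acquire an image, update the scene estimate, and repeat over a sequence of images.

    ```python
    """Hypothetical sketch of an active-vision control loop (illustrative only)."""

    import numpy as np


    def expected_gain(view, visit_counts):
        # Crude proxy for expected information gain: prefer the least-observed viewpoint.
        return -visit_counts[view]


    def acquire_image(view, rng):
        # Stand-in for steering the camera to `view` and grabbing a frame.
        return rng.random((48, 64))


    def update_belief(belief, image, view):
        # Stand-in for fusing the new observation into the scene estimate.
        belief[view] = float(image.mean())
        return belief


    def active_vision_loop(candidate_views, steps=5, seed=0):
        rng = np.random.default_rng(seed)
        visit_counts = {v: 0 for v in candidate_views}
        belief = {}
        for _ in range(steps):
            # 1. Actively select the next acquisition configuration (viewpoint).
            view = max(candidate_views, key=lambda v: expected_gain(v, visit_counts))
            # 2. Acquire an image with that configuration.
            image = acquire_image(view, rng)
            # 3. Update the belief from the growing sequence of observations.
            belief = update_belief(belief, image, view)
            visit_counts[view] += 1
        return belief


    if __name__ == "__main__":
        print(active_vision_loop(["left", "center", "right"]))
    ```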

References

  1. Kato, I. 1973. Development of WABOT-1. Biomechanism 2, 173–214. Tokyo: The University of Tokyo Press.

  2. Waseda. 2012. Humanoid History -WABOT. 2012, humanoid Robotics Institute, Waseda University.

  3. Sugano, S., and I. Kato. 1987. WABOT-2: Autonomous Robot with Dexterous Finger-arm–Finger-arm Coordination Control in Keyboard Performance. In IEEE International Conference on Robotics and Automation, vol. 4, 90–97.

  4. Konno, A., K. Nagashima, R. Furukawa, K. Nishiwaki, T. Noda, M. Inaba, and H. Inoue. 1997. Development of a Humanoid Robot Saika. In IEEE-RSJ International Conference on Intelligent Robots and Systems, vol. 2, 805–810.

  5. Yamaguchi, J., S. Inoue, D. Nishino, and A. Takanishi. 1998. Development of a Bipedal Humanoid Robot Having Antagonistic Driven Joints and Three DOF Trunk. In IEEE-RSJ International Conference on Intelligent Robots and Systems, vol. 1, 96–101.

  6. Brooks, R., C. Breazeal, M. Marjanović, B. Scassellati, and M. Williamson. 1999. The Cog Project: Building a Humanoid Robot. In Computation for Metaphors, Analogy, and Agents, ed. Nehaniv C., 52–87. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer.

  7. Scassellati, B. 1998. A Binocular, Foveated Active Vision System. Technical report, MIT Artificial Intelligence Lab, Cambridge, MA, USA

  8. Breazeal, C., A. Edsinger, P. Fitzpatrick, and B. Scassellati. 2001. Active Vision for Sociable Robots. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 31 (5): 443–453.

  9. Pfeifer, R., and J. Bongard. 2006. How the Body Shapes the Way We Think: A New View of Intelligence. New York: A Bradford Book.

  10. Inaba, M., T. Igarashi, S. Kagami, and H. Inoue. 1996. A 35 DOF Humanoid that Can Coordinate Arms and Legs in Standing up, Reaching and Grasping an Object. In IEEE-RSJ International Conference on Intelligent Robots and Systems, vol. 1, 29–36.

  11. Inaba, M., and H. Inoue. 1989. Robot Vision Server. In International Symposium on Industrial Robots, 195–202.

  12. Jin, Y., and M. Xie. 2000. Vision Guided Homing for Humanoid Service Robot. In International Conference on Pattern Recognition, vol. 4, 511–514.

  13. Saegusa, R., G. Metta, and G. Sandini. 2010. Own Body Perception based on Visuomotor Correlation. In IEEE-RSJ International Conference on Intelligent Robots and Systems, 1044–1051.

  14. Bartolozzi, C., F. Rea, C. Clercq, M. Hofstatter, D. Fasnacht, G. Indiveri, and G. Metta. 2011. Embedded Neuromorphic Vision for Humanoid Robots. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 129–135.

  15. Vernon, D., G. Metta, and G. Sandini. 2007. The iCub Cognitive Architecture: Interactive Development in a Humanoid Robot. In IEEE International Conference on Development and Learning, 122–127.

  16. Metta, G., L. Natale, F. Nori, and G. Sandini. 2011. The iCub Project: An Open Source Platform for Research in Embodied Cognition. In IEEE Workshop on Advanced Robotics and its Social Impacts, 24–26.

  17. Azad, P. 2008. Visual Perception for Manipulation and Imitation in Humanoid Robots. Ph.D. dissertation, University of Karlsruhe. ISBN 978-3-642-04229-4.

  18. Welke, K. 2011. Memory-Based Active Visual Search for Humanoid Robots. Ph.D. dissertation, KIT, Karlsruhe Institute of Technology, Computer Science Faculty, Institute for Anthropomatics.

  19. Thompson, S., and S. Kagami. 2005. Humanoid Robot Localisation using Stereo Vision. In IEEE-RAS International Conference on Humanoid Robots, 19–25.

  20. Lowe, D. 2004. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision 60 (2): 91–110.

  21. Konolige, K. Small Vision System.

  22. Kagami, S., K. Nishiwaki, J.J. Kuffner, Y. Kuniyoshi, M. Inaba, and H. Inoue. 2002. Online 3D Vision, Motion Planning and Bipedal Locomotion Control Coupling System of Humanoid Robot: H7. In IEEE-RSJ International Conference on Intelligent Robots and Systems, vol. 3, 2557–2562.

  23. Dellaert, F., D. Fox, W. Burgard, and S. Thrun. 1999. Monte Carlo Localization for Mobile Robots. In IEEE International Conference on Robotics and Automation, vol. 2, 1322–1328.

  24. Stasse, O., A.J. Davison, R. Sellaouti, and K. Yokoi. 2006. Real-time 3D SLAM for Humanoid Robot considering Pattern Generator Information. In IEEE-RSJ International Conference on Intelligent Robots and Systems, 348–355.

  25. Murase, H., and S. Nayar. 1993. Learning and Recognition of 3D Objects from Appearance. In IEEE Workshop on Qualitative Vision, 39–50.

  26. Davison, A., and N. Kita. 2001. 3D Simultaneous Localisation and Map-building using Active Vision for a Robot Moving on Undulating Terrain. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 384–391.

  27. Davison, A., I. Reid, N. Molton, and O. Stasse. 2007. MonoSLAM: Real-Time Single Camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (6): 1052–1067.

  28. Okada, K., M. Kojima, Y. Sagawa, T. Ichino, K. Sato, and M. Inaba. 2006. Vision based Behavior Verification System of Humanoid Robot for Daily Environment Tasks. In IEEE-RAS International Conference on Humanoid Robots, 7–12.

  29. Okada, K., M. Kojima, S. Tokutsu, T. Maki, Y. Mori, and M. Inaba. 2007. Multi-cue 3D Object Recognition in Knowledge-based Vision-guided Humanoid Robot System. In IEEE-RSJ International Conference on Intelligent Robots and Systems, 3217–3222.

  30. Okada, K., M. Kojima, S. Tokutsu, Y. Mori, T. Maki, and M. Inaba. 2008. Task Guided Attention Control and Visual Verification in Tea Serving by the Daily Assistive Humanoid HRP2JSK. In IEEE-RSJ International Conference on Intelligent Robots and Systems, 1551–1557.

  31. Brooks, R., and L. Stein. 1994. Building Brains for Bodies. Autonomous Robots 1: 7–25.

  32. Kuffner, J., K. Nishiwaki, S. Kagami, M. Inaba, and H. Inoue. 2003. Motion Planning for Humanoid Robots. In International Symposium on Robotics Research, 365–374.

  33. Prats, M., S. Wieland, T. Asfour, A. del Pobil, and R. Dillmann. 2008. Compliant Interaction in Household Environments by the Armar-III Humanoid Robot. In IEEE-RAS International Conference on Humanoid Robots, 475–480.

  34. Wieland, S., D. Gonzalez-Aguirre, T. Asfour, and R. Dillmann. 2009. Combining Force and Visual Feedback for Physical Interaction Tasks in Humanoid Robots. In IEEE-RAS International Conference on Humanoid Robots, 439–446.

  35. Gonzalez-Aguirre, D., S. Wieland, T. Asfour, and R. Dillmann. 2009. On Environmental Model-Based Visual Perception for Humanoids. In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. Lecture Notes in Computer Science, eds. Bayro-Corrochano, E., and J.-O. Eklundh, vol. 5856, 901–909. Berlin, Heidelberg: Springer.

  36. Harris, C., and M. Stephens. 1988. A Combined Corner and Edge Detector. In Alvey Vision Conference, 147–151. Manchester, UK.

  37. Noskovicova, L., and R. Ravas. 2010. Subpixel Corner Detection for Camera Calibration. In MECHATRONIKA, International Symposium, 78–80.

  38. Shi, J., and C. Tomasi. 1994. Good Features to Track. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 593–600.

  39. Pérez, P., C. Hue, J. Vermaak, and M. Gangnet. 2002. Color-Based Probabilistic Tracking. In European Conference on Computer Vision-Part I, 661–675. Berlin: Springer.

  40. Canny, J. 1986. A Computational Approach to Edge Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 8 (6): 679–698.

  41. Flint, A., C. Mei, I. Reid, and D. Murray. 2010. Growing Semantically Meaningful Models for Visual SLAM. In IEEE Conference on Computer Vision and Pattern Recognition, 467–474.

  42. Wen, F., X. Chai, Y. Li, W. Zou, K. Yuan, and P. Chen. 2011. An Improved Visual SLAM Algorithm based on Mixed Data Association. In World Congress on Intelligent Control and Automation, 1081–1086.

  43. Civera, J., D. Galvez-Lopez, L. Riazuelo, J.D. Tardos, and J.M.M. Montiel. 2011. Towards Semantic SLAM using a Monocular Camera. In IEEE-RSJ International Conference on Intelligent Robots and Systems, 1277–1284.

  44. Ahn, S., M. Choi, J. Choi, and W.K. Chung. 2006. Data Association Using Visual Object Recognition for EKF-SLAM in Home Environment. In IEEE-RSJ International Conference on Intelligent Robots and Systems, 2588–2594.

  45. Bay, H., T. Tuytelaars, and L.V. Gool. 2006. Surf: Speeded up Robust Features. In European Conference on Computer Vision, 404–417.

  46. Jeong, W.Y., and K.M. Lee. 2006. Visual SLAM with Line and Corner Features. In IEEE-RSJ International Conference on Intelligent Robots and Systems, 2570–2575.

  47. Klank, U., D. Pangercic, R. Rusu, and M. Beetz. 2009. Real-time CAD Model Matching for Mobile Manipulation and Grasping. In IEEE-RAS International Conference on Humanoid Robots, 290–296.

  48. Rusu, R.B., G. Bradski, R. Thibaux, and J. Hsu. 2010. Fast 3D Recognition and Pose Using the Viewpoint Feature Histogram. In IEEE-RSJ International Conference on Intelligent Robots and Systems.

  49. Meeussen, W., M. Wise, S. Glaser, S. Chitta, C. McGann, P. Mihelich, E. Marder-Eppstein, M. Muja, V. Eruhimov, T. Foote, J. Hsu, R. Rusu, B. Marthi, G. Bradski, K. Konolige, B. Gerkey, and E. Berger. 2010. Autonomous Door Opening and Plugging in with a Personal Robot. In IEEE International Conference on Robotics and Automation, 729–736.

  50. Sturm, J., K. Konolige, C. Stachniss, and W. Burgard. 2010. 3D Pose Estimation, Tracking and Model Learning of Articulated Objects from Dense Depth Video using Projected Texture Stereo. In Workshop RGB-D: Advanced Reasoning with Depth Cameras at Robotics: Science and Systems.

  51. Muja, M., R.B. Rusu, G. Bradski, and D.G. Lowe. 2011. REIN - A Fast, Robust, Scalable Recognition Infrastructure. In IEEE International Conference on Robotics and Automation, 2939–2946.

  52. Kakiuchi, Y., R. Ueda, K. Okada, and M. Inaba. 2011. Creating Household Environment Map for Environment Manipulation using Color Range Sensors on Environment and Robot. In IEEE International Conference on Robotics and Automation, 305–310.

  53. Chin, R., and C. Dyer. 1986. Model-based Recognition in Robot Vision. ACM Computing Surveys 18: 67–108.

  54. Ullman, S. 2000. High-Level Vision: Object Recognition and Visual Cognition. The MIT Press. ISBN-10: 0-262-71007-2.

  55. Lepetit, V., and P. Fua. 2005. Monocular Model-Based 3D Tracking of Rigid Objects: A Survey. Foundations and Trends in Computer Graphics and Vision 1 (1): 1–89.

  56. Roth, P.M., and M. Winter. 2008. Survey of Appearance-based Methods for Object Recognition. Institute for Computer Graphics and Vision, Graz University of Technology, Austria, Technical Reports.

  57. Kragic, D., A. Miller, and P. Allen. 2001. Real-time Tracking Meets Online Grasp Planning. In IEEE International Conference on Robotics and Automation, vol. 3, 2460–2465.

  58. Azad, P., T. Asfour, and R. Dillmann. 2006. Combining Appearance-based and Model-based Methods for Real-Time Object Recognition and 6D Localization. In IEEE-RSJ International Conference on Intelligent Robots and Systems, 5339–5344.

  59. Azad, P., T. Asfour, and R. Dillmann. 2009. Combining Harris Interest Points and the SIFT Descriptor for Fast Scale-Invariant Object Recognition. In IEEE-RSJ International Conference on Intelligent Robots and Systems, 4275–4280.

  60. Wei-Hsuan C., H. Chih-Hsien, T. Yi-Che, C. Shih-Hung, Y. Fun, and C. Jen-Shiun. 2009. An Efficient Object Recognition System for Humanoid Robot Vision. In Joint Conferences on Pervasive Computing, 209–214.

  61. Azad, P., D. Muench, T. Asfour, and R. Dillmann. 2011. 6-DoF Model-based Tracking of Arbitrarily Shaped 3D Objects. In IEEE International Conference on Robotics and Automation.

  62. Davison, A. 1998. Mobile Robot Navigation Using Active Vision. Ph.D. dissertation, Robotics Research Group, Department of Engineering Science, University of Oxford.

  63. Faugeras, O. 1995. Stratification of 3-D Vision: Projective, Affine, and Metric Representations. Journal of the Optical Society of America A 12: 465–484.

  64. Gonzalez-Aguirre, D., T. Asfour, and R. Dillmann. 2011. Towards Stratified Model-based Environmental Visual Perception for Humanoid Robots. Pattern Recognition Letters 32 (16): 2254–2260. (advances in Theory and Applications of Pattern Recognition, Image Processing and Computer Vision).

  65. Russell, S., and P. Norvig. 1995. Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence. ISBN 9780136042594.

  66. Daniilidis, K., and J.-O. Eklundh. 2008. 3-D Vision and Recognition. In Springer Handbook of Robotics, ed. B. Siciliano, and O. Khatib, 543–562. Berlin: Springer.

  67. Gonzalez-Aguirre, D., T. Asfour, E. Bayro-Corrochano, and R. Dillmann. 2008. Model-based Visual Self-localization Using Geometry and Graphs. In International Conference on Pattern Recognition, 1–5.

Author information

Corresponding author

Correspondence to David Israel González Aguirre.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this chapter

González Aguirre, D.I. (2019). State-of-the-Art. In: Visual Perception for Humanoid Robots. Cognitive Systems Monographs, vol 38. Springer, Cham. https://doi.org/10.1007/978-3-319-97841-3_2
