3-D Vision for Navigation and Grasping

  • Danica Kragic
  • Kostas Daniilidis
Part of the Springer Handbooks book series (SHB)


In this chapter, we describe algorithms for three-dimensional (3-D) vision that help robots accomplish navigation and grasping. To model cameras, we start with the basics of perspective projection and the distortion due to lenses. This projection from a 3-D world to a two-dimensional (2-D) image can be inverted only by using information from the world or multiple 2-D views. If we know the 3-D model of an object or the location of 3-D landmarks, we can solve the pose estimation problem from one view. When two views are available, we can compute the 3-D motion and triangulate to reconstruct the world up to a scale factor. When multiple views are given, either as sparse viewpoints or as a continuous incoming video, the robot path can be computed and point tracks can yield a sparse 3-D representation of the world. In order to grasp objects, we can estimate the 3-D pose of the end effector or the 3-D coordinates of the graspable points on the object.
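The claim that a single perspective image cannot be inverted without extra information can be illustrated with a minimal sketch. It assumes an ideal pinhole camera with a one-parameter radial distortion term; the function name `project` and the intrinsics `f`, `cx`, `cy`, `k1` are hypothetical values chosen for illustration and do not come from the chapter.

```python
def project(point_3d, f=500.0, cx=320.0, cy=240.0, k1=0.0):
    """Project a 3-D point (camera coordinates) to pixel coordinates.

    f (focal length in pixels), (cx, cy) (principal point) and k1
    (radial distortion coefficient) are assumed, illustrative values.
    """
    X, Y, Z = point_3d
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    # Perspective division gives normalized image coordinates.
    x, y = X / Z, Y / Z
    # One-parameter radial distortion: scale by (1 + k1 * r^2).
    r2 = x * x + y * y
    d = 1.0 + k1 * r2
    # Apply focal length and principal point offset.
    return (f * d * x + cx, f * d * y + cy)


# A point and a copy three times farther away along the same viewing ray.
p = (0.2, -0.1, 2.0)
p_scaled = tuple(3.0 * c for c in p)

u1 = project(p)
u2 = project(p_scaled)

# Both points land on the same pixel (up to rounding), so depth is not
# recoverable from one view alone.
same_pixel = all(abs(a - b) < 1e-9 for a, b in zip(u1, u2))
```

Because `p` and `p_scaled` map to the same pixel, recovering depth requires either known 3-D structure (pose estimation from landmarks) or a second view (triangulation), and with two views alone the overall scale remains undetermined.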


Keywords: Point Cloud, Projection Matrix, Point Correspondence, Bundle Adjustment, World Coordinate System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.







GPS: global positioning system
IMU: inertial measurement unit
MRF: Markov random field
SLAM: simultaneous localization and mapping
SVD: singular value decomposition



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Centre for Autonomous Systems, Royal Institute of Technology (KTH), Stockholm, Sweden
  2. Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA
