
Depth Estimation with Ego-Motion Assisted Monocular Camera

Gyroscopy and Navigation

Abstract

We propose a method for estimating the distance to objects that exploits the complementary nature of monocular image sequences and camera kinematic parameters. The camera measurements are fused with the kinematic parameters measured by an IMU and an odometer using an extended Kalman filter. Field experiments with a wheeled robot corroborated the simulation results in terms of depth estimation accuracy. The performance of the approach is strongly affected by the relative geometry of the observer and the feature point, the measurement accuracy of the observer’s motion parameters, and the distance covered by the observer. Under favorable conditions the distance estimation error can be as small as 1% of the distance to a feature point. The approach can be used to estimate the distance to objects located hundreds of meters away from the camera.
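As a rough illustration of the fusion scheme, the sketch below outlines one EKF cycle for a single tracked feature. The state layout [x, y, ξ], the reading of ξ as inverse depth, and the helper propagate_state are assumptions made for this sketch only; it is not the authors' implementation, just a minimal outline of the predict/update structure for this kind of problem.

```python
import numpy as np

# Illustrative per-feature EKF skeleton (not the authors' code).
# State X = [x, y, xi]: feature pixel coordinates plus a depth-related
# parameter xi, taken here as inverse depth 1/Z for illustration.

def ekf_predict(X, P, propagate_state, A, G, Q):
    """Prediction step: propagate the state with the motion model (driven by
    the IMU/odometer-measured camera velocity and angular rate) and the
    covariance with its Jacobians A (state) and G (process noise, covariance Q)."""
    X_pred = propagate_state(X)
    P_pred = A @ P @ A.T + G @ Q @ G.T
    return X_pred, P_pred

def ekf_update(X_pred, P_pred, z, R):
    """Update step: correct with the tracked feature position z = [u, v] in
    pixels, which observes the first two state components directly."""
    H = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    X_upd = X_pred + K @ (z - H @ X_pred)
    P_upd = (np.eye(3) - K @ H) @ P_pred
    return X_upd, P_upd

# Depth estimate recovered from the filtered state (under the inverse-depth
# assumption): Z_hat = 1.0 / X_upd[2]
```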



ACKNOWLEDGMENT

The authors thank the staff of the TUT Mobile Robotics Lab for their help in arranging the field experiments.

Funding

This work was partially supported by the Russian Foundation for Basic Research, project no. 18-08-01101A, and by the Government of the Russian Federation (grant 08-08).

Author information


Corresponding author

Correspondence to M. Mansour.

APPENDIX

This appendix provides the derivation of the Jacobians (28) and (29) for the motion model given by Eq. (24).

$$\begin{gathered}
A_{k-1} = \left. \frac{\partial}{\partial X}\,\Psi\!\left(X_{k-1},\tilde{V}^{c}_{z_{k-1}},\tilde{\omega}^{c}_{y_{k-1}},q_{k-1}\right)\right|_{X=\hat{X}_{k-1},\,q_{k-1}=0}, \\
A_{k-1} = \left. \begin{bmatrix}
1 - \frac{2(x-c_{x})\tilde{\omega}^{c}_{y}\Delta t}{f} + \tilde{V}^{c}_{z}\xi\Delta t & 0 & \tilde{V}^{c}_{z}(x-c_{x})\Delta t \\
-\tilde{\omega}^{c}_{y}(y-c_{y})\frac{\Delta t}{f} & 1 + \tilde{V}^{c}_{z}\xi\Delta t - (x-c_{x})\tilde{\omega}^{c}_{y}\frac{\Delta t}{f} & \tilde{V}^{c}_{z}(y-c_{y})\Delta t \\
-\tilde{\omega}^{c}_{y}\xi\frac{\Delta t}{f} & 0 & 1 + 2\tilde{V}^{c}_{z}\xi\Delta t - (x-c_{x})\tilde{\omega}^{c}_{y}\frac{\Delta t}{f}
\end{bmatrix}\right|_{X=\hat{X}_{k-1}}, \\
G_{k-1} = \left. \frac{\partial}{\partial q}\,\Psi\!\left(X_{k-1},\tilde{V}^{c}_{z_{k-1}},\tilde{\omega}^{c}_{y_{k-1}},q_{k-1}\right)\right|_{X=\hat{X}_{k-1},\,q_{k-1}=0}, \\
G_{k-1} = \left. \begin{bmatrix}
f + \frac{(x-c_{x})^{2}}{f} & -\xi(x-c_{x}) \\
\frac{(x-c_{x})(y-c_{y})}{f} & -\xi(y-c_{y}) \\
\frac{\xi(x-c_{x})}{f} & -\xi^{2}
\end{bmatrix}\right|_{X=\hat{X}_{k-1}}.
\end{gathered}$$
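These expressions translate directly into code. The sketch below builds A_{k-1} and G_{k-1} from a state estimate (x, y, ξ), the measured camera-frame velocity Ṽ_z^c and angular rate ω̃_y^c, the focal length f, the principal point (c_x, c_y), and the time step Δt; the function and argument names are illustrative only.

```python
import numpy as np

def jacobian_A(x, y, xi, Vz, wy, f, cx, cy, dt):
    """State Jacobian A_{k-1} of the motion model, evaluated at the current
    state estimate (x, y, xi) with the measured camera-frame velocity Vz and
    angular rate wy, following the expression in the appendix."""
    return np.array([
        [1 - 2*(x - cx)*wy*dt/f + Vz*xi*dt, 0.0,                             Vz*(x - cx)*dt],
        [-wy*(y - cy)*dt/f,                 1 + Vz*xi*dt - (x - cx)*wy*dt/f, Vz*(y - cy)*dt],
        [-wy*xi*dt/f,                       0.0,                             1 + 2*Vz*xi*dt - (x - cx)*wy*dt/f],
    ])

def jacobian_G(x, y, xi, f, cx, cy):
    """Process-noise Jacobian G_{k-1}, evaluated at the current state
    estimate, following the expression in the appendix."""
    return np.array([
        [f + (x - cx)**2/f,   -xi*(x - cx)],
        [(x - cx)*(y - cy)/f, -xi*(y - cy)],
        [xi*(x - cx)/f,       -xi**2],
    ])
```

In the filter, these matrices enter the covariance propagation in the usual way, P_k^- = A_{k-1} P_{k-1} A_{k-1}^T + G_{k-1} Q_{k-1} G_{k-1}^T, where Q_{k-1} is the covariance of the process noise q.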



Cite this article

Mansour, M., Davidson, P., Stepanov, O. et al. Depth Estimation with Ego-Motion Assisted Monocular Camera. Gyroscopy Navig. 10, 111–123 (2019). https://doi.org/10.1134/S2075108719030064


Keywords: Navigation