
Multimedia Tools and Applications, Volume 77, Issue 13, pp 16199–16222

Less restrictive camera odometry estimation from monocular camera

  • Zeyd Boukhers
  • Kimiaki Shirahama
  • Marcin Grzegorzek
Article

Abstract

This paper addresses the problem of estimating camera motion from a non-calibrated monocular camera. In contrast to existing methods that rely on restrictive assumptions, we propose a method that estimates camera motion under far fewer restrictions by adopting new example-based techniques that compensate for the missing information. Specifically, we estimate the focal length of the camera by referring to visually similar training images with which focal lengths are associated. For one-step camera motion estimation, we refer to stationary points (landmark points) whose depths are estimated from RGB-D candidates. In addition to landmark points, moving objects can also serve as an information source for estimating the camera motion. Our method therefore simultaneously estimates the camera motion in a video and the 3D trajectories of the objects it contains, using Reversible Jump Markov Chain Monte Carlo (RJ-MCMC) particle filtering. Evaluation on challenging datasets demonstrates the effectiveness and efficiency of our method.
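To make the example-based focal-length idea concrete, here is a minimal sketch, not the authors' implementation: it assumes a database of training images with known focal lengths, each represented by a global descriptor (e.g. a GIST-style feature), and predicts the query's focal length from its k nearest neighbours. The function name, the input layout, and the median aggregation are illustrative assumptions.

```python
import numpy as np

def estimate_focal_length(query_feat, train_feats, train_focals, k=5):
    """Estimate the focal length of an uncalibrated camera by
    k-nearest-neighbour retrieval over global image descriptors.

    query_feat  : (d,) descriptor of the query frame
    train_feats : (n, d) descriptors of the training images
    train_focals: (n,) focal lengths associated with the training images
    """
    # Distance from the query descriptor to every training descriptor
    dists = np.linalg.norm(train_feats - query_feat, axis=1)
    nearest = np.argsort(dists)[:k]  # indices of the k most similar images
    # The median of the retrieved focal lengths is robust to bad retrievals
    return float(np.median(train_focals[nearest]))
```

In this setting the training database is simply two aligned arrays, pairing each image descriptor with the focal length recorded for that image (for example, from capture metadata).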
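The joint estimation of camera motion and object trajectories is carried out with RJ-MCMC particle filtering. Below is a heavily simplified, illustrative sketch of one filtering step under assumed state and move definitions; it is not the paper's algorithm. In particular, the acceptance test omits the proposal and Jacobian terms that a full reversible-jump ratio requires for the dimension-changing (add/remove object) moves.

```python
import numpy as np

rng = np.random.default_rng(0)

def propose(state):
    """Pick one move at random; 'add'/'remove' change the state dimension."""
    s = {"cam": state["cam"].copy(), "objs": [o.copy() for o in state["objs"]]}
    move = rng.choice(["camera", "object", "add", "remove"])
    if move == "camera":
        s["cam"] += rng.normal(0.0, 0.01, size=6)       # perturb 6-DoF motion
    elif move == "object" and s["objs"]:
        i = rng.integers(len(s["objs"]))
        s["objs"][i] += rng.normal(0.0, 0.05, size=3)   # perturb one 3D point
    elif move == "add":
        s["objs"].append(rng.normal(0.0, 1.0, size=3))  # birth move
    elif move == "remove" and s["objs"]:
        s["objs"].pop(rng.integers(len(s["objs"])))     # death move
    return s

def rjmcmc_filter_step(state, log_posterior, n_iters=500):
    """One filtering step: run an MCMC chain over the joint state
    (camera motion + object positions) and return the visited samples."""
    cur_lp = log_posterior(state)
    samples = []
    for _ in range(n_iters):
        prop = propose(state)
        lp = log_posterior(prop)
        # NOTE: a full reversible-jump ratio also includes proposal and
        # Jacobian terms for the dimension-changing moves; omitted here.
        if np.log(rng.random()) < lp - cur_lp:          # Metropolis-Hastings
            state, cur_lp = prop, lp
        samples.append(state)
    return samples
```

The retained samples approximate the joint posterior over the camera motion and the set of object positions for the current frame; the chain's final state would seed the next frame's step.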

Keywords

Camera odometry · RJ-MCMC particle filtering · Trajectory extraction


Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Research Group for Pattern Recognition, University of Siegen, Siegen, Germany
  2. Faculty of Informatics and Communication, University of Economics in Katowice, Katowice, Poland
