Joint optimization based on direct sparse stereo visual-inertial odometry
This paper proposes a novel method for fusing an inertial measurement unit (IMU) with a stereo camera, based on direct sparse odometry (DSO) and stereo DSO. It jointly optimizes all model parameters within a sliding window, including the inverse depth of all selected pixels and the intrinsic and extrinsic camera parameters of all keyframes. The vision part uses a photometric error function that optimizes 3D geometry and camera pose in a combined energy functional. The proposed algorithm uses image patches to capture the neighborhood of each selected pixel and forms measurement residuals directly in image intensity space. A fixed-baseline stereo camera resolves scale drift. IMU measurements are accumulated over several frames using on-manifold pre-integration and are inserted into the optimization as additional constraints between keyframes. Scale and gravity are incorporated into the stereo visual-inertial odometry model and are optimized together with the other variables, such as poses. The experimental results show that the tracking accuracy and robustness of the proposed method are superior to those of state-of-the-art IMU fusion methods. In addition, compared with previous semi-dense direct methods, the proposed method achieves higher reconstruction density and better scene recovery.
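As a rough illustration of the IMU pre-integration step mentioned in the abstract, the following Python sketch accumulates gyroscope and accelerometer samples between two keyframes into relative rotation, velocity, and position increments. It is a minimal sketch of the standard on-manifold formulation, not the authors' implementation: the function names (`skew`, `so3_exp`, `preintegrate`) are hypothetical, biases are assumed constant over the interval, and noise covariance propagation and bias Jacobians are omitted.

```python
import numpy as np


def skew(v):
    """Return the 3x3 skew-symmetric matrix of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])


def so3_exp(phi):
    """Exponential map from a rotation vector to a rotation matrix (Rodrigues)."""
    theta = np.linalg.norm(phi)
    K = skew(phi)
    if theta < 1e-8:
        # First-order approximation for very small rotations.
        return np.eye(3) + K
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))


def preintegrate(gyro, accel, dt, bias_g=None, bias_a=None):
    """Accumulate IMU samples into relative increments (dR, dv, dp).

    gyro, accel: sequences of 3-vectors sampled at a fixed interval dt.
    bias_g, bias_a: assumed-constant gyroscope and accelerometer biases.
    Gravity is deliberately not applied here; it enters later, when the
    increments are compared against the keyframe states in the optimizer.
    """
    bias_g = np.zeros(3) if bias_g is None else np.asarray(bias_g, dtype=float)
    bias_a = np.zeros(3) if bias_a is None else np.asarray(bias_a, dtype=float)
    dR = np.eye(3)
    dv = np.zeros(3)
    dp = np.zeros(3)
    for w, a in zip(gyro, accel):
        # Rotate the bias-corrected acceleration into the first keyframe's frame.
        a_corr = dR @ (np.asarray(a, dtype=float) - bias_a)
        dp = dp + dv * dt + 0.5 * a_corr * dt**2
        dv = dv + a_corr * dt
        dR = dR @ so3_exp((np.asarray(w, dtype=float) - bias_g) * dt)
    return dR, dv, dp
```

Because the increments depend only on the raw measurements and the bias estimate, they can be computed once per keyframe pair and reused as a single constraint in each iteration of the sliding-window optimization.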
Keywords: Direct sparse odometry; IMU pre-integration; Sliding window optimization; 3D reconstruction
This work was supported by the National Natural Science Foundation of China (Project Nos. 61673125, 61773333) and the China Scholarship Council (CSC, Project No. 201908130016).