We propose a novel edge-based visual–inertial fusion approach to address the problem of tracking aggressive motions with real-time state estimates. At the front-end, our system performs edge alignment, which estimates relative poses in the distance transform domain and offers a larger convergence basin and stronger resistance to changing lighting conditions or camera exposures than the popular direct dense tracking. At the back-end, a sliding-window optimization-based framework fuses visual and inertial measurements. We utilize efficient inertial measurement unit (IMU) preintegration and two-way marginalization to generate accurate and smooth estimates with limited computational resources. To increase the robustness of the system, we perform an edge-alignment self-check and an IMU-aided external check. Extensive statistical analysis and comparisons are presented to verify the performance of our approach and its usability on resource-constrained platforms. Compared with state-of-the-art point feature-based visual–inertial fusion methods, our approach achieves better robustness under extreme motions or low frame rates, at the expense of slightly lower accuracy in general scenarios. We release our implementation as open-source ROS packages.
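To illustrate what alignment in the distance transform domain means, the following is a minimal sketch and not the released implementation. It assumes a toy setting in which the unknown motion is a pure 2D pixel translation, whereas the approach described above estimates a full relative pose. The helper names `distance_transform` and `alignment_cost`, the brute-force distance transform, and the exhaustive search over candidate shifts are all illustrative assumptions.

```python
import numpy as np

def distance_transform(edge_mask):
    """Brute-force Euclidean distance from every pixel to its nearest edge pixel."""
    h, w = edge_mask.shape
    edge_pts = np.argwhere(edge_mask).astype(float)              # (N, 2) as (row, col)
    gy, gx = np.mgrid[0:h, 0:w]
    grid = np.stack([gy.ravel(), gx.ravel()], axis=1).astype(float)
    dists = np.linalg.norm(grid[:, None, :] - edge_pts[None, :, :], axis=2)
    return dists.min(axis=1).reshape(h, w)

def alignment_cost(dt, points, shift):
    """Sum of squared distance-transform values sampled at the shifted edge points."""
    moved = np.rint(points + shift).astype(int)
    h, w = dt.shape
    moved[:, 0] = np.clip(moved[:, 0], 0, h - 1)                 # keep samples inside the image
    moved[:, 1] = np.clip(moved[:, 1], 0, w - 1)
    return float(np.sum(dt[moved[:, 0], moved[:, 1]] ** 2))

# Reference frame: the outline of a square serves as a synthetic edge map.
ref = np.zeros((40, 40), dtype=bool)
ref[10, 10:30] = True
ref[29, 10:30] = True
ref[10:30, 10] = True
ref[10:30, 29] = True
dt = distance_transform(ref)

# Current frame: the same edges displaced by a known offset (toy ground truth).
true_shift = np.array([3, -2])
cur_pts = np.argwhere(ref) - true_shift

# Exhaustive search over integer shifts; a real system would use iterative optimization.
candidates = [np.array([dy, dx]) for dy in range(-5, 6) for dx in range(-5, 6)]
costs = [alignment_cost(dt, cur_pts, s) for s in candidates]
best_shift = candidates[int(np.argmin(costs))]
print("recovered shift:", best_shift)
```

Because the distance transform grows smoothly with distance from the nearest edge, the cost stays informative even when the edges are several pixels apart, which is one intuition for the larger convergence basin claimed above. The cost also depends only on edge locations and not on raw intensities, which is consistent with the stated resistance to lighting and exposure changes.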
The authors acknowledge the funding support from HKUST internal Grant R9341 and HKUST institutional scholarship. We would like to thank all AURO reviewers for their exceptionally useful reviews.
Cite this article
Ling, Y., Kuse, M. & Shen, S. Edge alignment-based visual–inertial fusion for tracking of aggressive motions. Auton Robot 42, 513–528 (2018). https://doi.org/10.1007/s10514-017-9642-0
- Visual–inertial fusion
- Edge alignment
- Tracking of aggressive motions
- Visual–inertial odometry