
Robust real-time tracking by fusing measurements from inertial and vision sensors

  • Special Issue
  • Published in the Journal of Real-Time Image Processing

Abstract

The problem of estimating and predicting the position and orientation (pose) of a camera is approached by fusing measurements from inertial sensors (accelerometers and rate gyroscopes) and vision. The sensor fusion approach described in this contribution is based on non-linear filtering of the measurements from these complementary sensors. In this way, accurate and robust pose estimates are available for the primary purpose of augmented reality applications, with the secondary effect of reducing computation time and improving performance in the vision processing. A real-time implementation of a multi-rate extended Kalman filter is described, using a dynamic model with 22 states, in which 12.5 Hz vision correspondences and 100 Hz inertial measurements are processed. An example is presented in which an industrial robot moves the sensor unit. The advantage of this configuration is that it provides ground truth for the pose, allowing for objective performance evaluation. The results show that we obtain an absolute accuracy of 2 cm in position and 1° in orientation.
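The fusion scheme summarised above is, at its core, a standard extended Kalman filter predict/update loop in which the 100 Hz inertial samples drive the time update and each 12.5 Hz vision frame triggers a measurement update (eight inertial predictions per vision frame). The sketch below is a minimal illustration of that multi-rate structure only, not the authors' implementation: the reduced state dimension, the placeholder process and measurement models f and h, and all noise covariances are hypothetical assumptions.

```python
# Minimal multi-rate EKF sketch (NumPy). Hypothetical placeholders
# throughout: the paper's filter uses a 22-state dynamic model with
# proper inertial process and camera measurement models; here the
# state, f, h, Q and R are illustrative stand-ins.
import numpy as np

class MultiRateEKF:
    def __init__(self, n):
        self.x = np.zeros(n)                 # state estimate
        self.P = np.eye(n)                   # state covariance
        self.Q = 1e-3 * np.eye(n)            # process noise (assumed)

    def predict(self, f, F, dt):
        """Time update, driven by 100 Hz inertial data."""
        self.x = f(self.x, dt)               # nonlinear process model
        self.P = F @ self.P @ F.T + self.Q * dt

    def update(self, z, h, H, R):
        """Measurement update from 12.5 Hz vision correspondences."""
        y = z - h(self.x)                    # innovation
        S = H @ self.P @ H.T + R             # innovation covariance
        K = self.P @ H.T @ np.linalg.inv(S)  # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(self.x.size) - K @ H) @ self.P

# Multi-rate loop: 100 / 12.5 = 8 inertial predictions per vision frame.
ekf = MultiRateEKF(n=9)                      # reduced state, for illustration
dt_imu = 1.0 / 100.0
H = np.zeros((2, 9)); H[0, 0] = H[1, 1] = 1.0
for k in range(800):                         # 8 s of simulated time
    ekf.predict(f=lambda x, dt: x, F=np.eye(9), dt=dt_imu)
    if k % 8 == 0:                           # a vision frame is available
        z = np.zeros(2)                      # hypothetical 2D correspondence
        ekf.update(z, h=lambda x: H @ x, H=H, R=0.01 * np.eye(2))
```

Because the time update runs at the inertial rate, a pose prediction is available between camera frames; the vision step can then restrict its feature search to a region around the predicted projection, which is consistent with the computation-time reduction mentioned in the abstract.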


Acknowledgments

This work was performed within the MATRIS consortium, a research project within the European Union (EU) Sixth Framework Programme, contract number IST-002013.

Author information

Correspondence to Jeroen D. Hol.

About this article

Cite this article

Hol, J.D., Schön, T.B., Luinge, H. et al. Robust real-time tracking by fusing measurements from inertial and vision sensors. J Real-Time Image Proc 2, 149–160 (2007). https://doi.org/10.1007/s11554-007-0040-2


Keywords

Navigation