Abstract
The problem of estimating and predicting the position and orientation (pose) of a camera is approached by fusing measurements from inertial sensors (accelerometers and rate gyroscopes) and vision. The sensor fusion approach described in this contribution is based on non-linear filtering of these complementary sensors. This way, accurate and robust pose estimates are available for the primary purpose of augmented reality applications, with the secondary effect of reducing computation time and improving the performance of the vision processing. A real-time implementation of a multi-rate extended Kalman filter is described, using a dynamic model with 22 states, in which 12.5 Hz correspondences from vision and 100 Hz inertial measurements are processed. An example where an industrial robot is used to move the sensor unit is presented. The advantage of this configuration is that it provides ground truth for the pose, allowing for objective performance evaluation. The results show that we obtain an absolute accuracy of 2 cm in position and 1° in orientation.
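The multi-rate structure described in the abstract — time updates at the 100 Hz inertial rate, with a measurement update whenever a 12.5 Hz vision correspondence arrives — can be sketched as follows. This is a minimal illustration on a hypothetical 1-D constant-acceleration model, not the paper's 22-state EKF; the noise values, the stand-in accelerometer input, and the position-only vision measurement are all assumptions made for the example.

```python
import numpy as np

dt = 0.01                                # 100 Hz inertial sampling interval
F = np.array([[1.0, dt], [0.0, 1.0]])    # state transition, state = [position, velocity]
B = np.array([[0.5 * dt**2], [dt]])      # accelerometer sample enters as a known input
Q = 1e-4 * np.eye(2)                     # process noise covariance (assumed value)
H = np.array([[1.0, 0.0]])               # vision measures position only (assumption)
R = np.array([[1e-2]])                   # vision measurement noise (assumed value)

def predict(x, P, acc):
    """Time update, run at the 100 Hz inertial rate."""
    x = F @ x + B * acc
    P = F @ P @ F.T + Q
    return x, P

def update(x, P, z):
    """Measurement update, run when a 12.5 Hz vision correspondence arrives."""
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(2) - K @ H) @ P
    return x, P

x = np.zeros((2, 1))                     # initial state estimate
P = np.eye(2)                            # initial covariance
for k in range(100):                     # one second of simulated data
    x, P = predict(x, P, acc=0.0)        # stand-in for a real accelerometer sample
    if (k + 1) % 8 == 0:                 # 100 Hz / 12.5 Hz = 8 samples per image
        x, P = update(x, P, z=np.array([[0.0]]))
```

The key design point is that the filter never waits for the slow sensor: inertial predictions keep the pose estimate available at full rate, and each vision frame simply corrects the accumulated drift when it arrives.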
Acknowledgments
This work has been performed within the MATRIS consortium, a research project within the European Union's Sixth Framework Programme, contract number IST-002013.
Cite this article
Hol, J.D., Schön, T.B., Luinge, H. et al. Robust real-time tracking by fusing measurements from inertial and vision sensors. J Real-Time Image Proc 2, 149–160 (2007). https://doi.org/10.1007/s11554-007-0040-2