International Journal of Computer Vision, Volume 78, Issue 1, pp 107–118

Mutual Information-Based 3D Object Tracking

Article

Abstract

We propose a robust methodology for 3D model-based markerless tracking of textured objects in monocular image sequences. The technique is based on mutual information maximization, a widely used criterion for multi-modal image registration, and employs an efficient multiresolution strategy to gain robustness while keeping computation time low, yielding near real-time performance for visual tracking of complex textured surfaces.
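As a rough illustration of the criterion named in the abstract, the sketch below estimates the mutual information between two equally sized sets of pixel intensities from their joint histogram. It is not the authors' implementation: the function name `mutual_information`, the `bins` and `max_value` parameters, and the plain histogram estimator are assumptions made for this example (the keywords suggest the paper itself relies on B-spline interpolation). A tracker built on this criterion would search for the pose whose rendered template maximizes this score against the current frame.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_value=255):
    """Histogram estimate of mutual information (in nats) between two
    equally sized sequences of pixel intensities in [0, max_value].

    Hypothetical helper for illustration only.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")

    def cell(v):
        # Quantize an intensity into one of `bins` histogram cells.
        return min(int(v * bins / (max_value + 1)), bins - 1)

    qa = [cell(v) for v in a]
    qb = [cell(v) for v in b]
    n = len(qa)

    joint = Counter(zip(qa, qb))  # joint histogram counts
    pa = Counter(qa)              # marginal counts for a
    pb = Counter(qb)              # marginal counts for b

    mi = 0.0
    for (i, j), c in joint.items():
        # p_ij * log(p_ij / (p_i * p_j)), written in terms of raw counts.
        mi += (c / n) * math.log(c * n / (pa[i] * pb[j]))
    return mi


template = [0, 64, 128, 192] * 4
inverted = [255 - v for v in template]   # same structure, remapped intensities
constant = [100] * len(template)         # carries no information

mi_same = mutual_information(template, template)
mi_inverted = mutual_information(template, inverted)
mi_constant = mutual_information(template, constant)
```

In this toy case the inverted image scores as high as the template itself, while the constant image scores zero. That insensitivity to intensity remapping is what makes the measure attractive for multi-modal registration, where a sum-of-squared-differences score would penalize the inverted image heavily.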

Keywords

Surface-image alignment, Mutual information, Nonlinear optimization, B-Spline interpolation, Multiresolution, 3D tracking, Template matching

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. Chair for Robotics and Embedded Systems, Technical University of Munich, Garching bei Muenchen, Germany
