International Journal of Computer Vision, Volume 24, Issue 2, pp 137–154

Alignment by Maximization of Mutual Information

  • Paul Viola
  • William M. Wells III


A new information-theoretic approach is presented for finding the pose of an object in an image. The technique does not require information about the surface properties of the object, besides its shape, and is robust with respect to variations of illumination. In our derivation few assumptions are made about the nature of the imaging process. As a result the algorithms are quite general and may foreseeably be used in a wide variety of imaging situations.

Experiments are presented that demonstrate the approach on four tasks: registering magnetic resonance (MR) images, aligning a complex 3D object model to real scenes that include clutter and occlusion, tracking a human head in a video sequence, and aligning a view-based 2D object model to real images.

The method is based on a formulation of the mutual information between the model and the image. As applied here the technique is intensity-based, rather than feature-based. It works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation. Additionally, it has an efficient implementation that is based on stochastic approximation.
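The idea of scoring a candidate alignment by the mutual information between model and image intensities can be illustrated with a small sketch. This is not the authors' implementation: the abstract describes an efficient scheme based on stochastic approximation, whereas the sketch below swaps in a plain joint-histogram estimate and an exhaustive search over integer shifts of a 1D signal. The function names `mutual_information` and `best_shift`, the bin count, and the synthetic data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(u, v, bins=8, lo=0.0, hi=1.0):
    """Histogram estimate (in nats) of I(u; v) for paired intensity samples."""
    if len(u) != len(v) or not u:
        raise ValueError("u and v must be non-empty and of equal length")

    def bin_index(x):
        # Map an intensity to a bin, clamping so x == hi lands in the last bin.
        k = int((x - lo) / (hi - lo) * bins)
        return min(max(k, 0), bins - 1)

    n = len(u)
    pairs = [(bin_index(a), bin_index(b)) for a, b in zip(u, v)]
    joint = Counter(pairs)
    count_u = Counter(i for i, _ in pairs)
    count_v = Counter(j for _, j in pairs)

    # I(u; v) = sum over bins of p(i, j) * log( p(i, j) / (p(i) * p(j)) )
    mi = 0.0
    for (i, j), c in joint.items():
        mi += (c / n) * math.log(c * n / (count_u[i] * count_v[j]))
    return mi


def best_shift(model, image, max_shift, bins=8):
    """Exhaustive search (an assumption, not the paper's optimizer) over
    integer shifts s, pairing model[k] with image[k + s]."""
    best = None
    for s in range(-max_shift, max_shift + 1):
        # Indices k for which both model[k] and image[k + s] exist.
        ks = range(max(0, -s), min(len(model), len(image) - s))
        score = mutual_information([model[k] for k in ks],
                                   [image[k + s] for k in ks], bins)
        if best is None or score > best[1]:
            best = (s, score)
    return best


# Synthetic data: the "image" is the model shifted by 3 samples with
# inverted intensities, so plain correlation at the true shift is negative.
rng = random.Random(0)
model = [rng.random() for _ in range(400)]
image = [rng.random() for _ in range(3)] + [1.0 - x for x in model]
shift, score = best_shift(model, image, max_shift=5)
print("estimated shift:", shift)
```

Because mutual information measures statistical dependence rather than similarity of raw values, the inverted intensities in this toy example do not prevent the search from finding the alignment, which is in the spirit of the robustness to illumination that the abstract claims.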





Copyright information

© Kluwer Academic Publishers 1997

Authors and Affiliations

  • Paul Viola (1)
  • William M. Wells III (2, 3)

  1. Massachusetts Institute of Technology, Artificial Intelligence Laboratory, Cambridge
  2. Massachusetts Institute of Technology, Artificial Intelligence Laboratory, USA
  3. Department of Radiology, Harvard Medical School and Brigham and Women's Hospital, USA
