Non-Euclidean object representations for calibration-free video overlay

  • Kiriakos N. Kutulakos
  • James R. Vallino
3D Representations and Applications
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1144)


We show that the overlay of 3D graphical objects onto live video taken by a mobile camera can be considerably simplified when the camera, the camera's environment and the graphical objects are represented in an affine frame of reference. The key feature of the approach is that it does not use any metric information about the calibration parameters of the camera, the position of the user interacting with the system, or the 3D locations and dimensions of the environment's objects. The only requirement is the ability to track across frames at least four features (points or lines) that are specified by the user at system initialization time and whose world coordinates are unknown. Our approach is based on the following observation: Given a set of four or more non-coplanar 3D points, the projection of all points in the set can be computed as a linear combination of the projections of just four of the points. We exploit this observation by (1) tracking lines and feature points at frame rate, and (2) representing graphical objects in an affine frame of reference that allows the projection of virtual objects to be computed as a linear combination of the projection of the feature points.
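The observation above can be illustrated with a minimal sketch, assuming an affine (weak-perspective style) camera, under which affine combinations of 3D points are preserved by projection. All function names, the camera matrix, and the sample coordinates below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Sketch: reprojection of a point from the projections of four basis points.
# Assumption: the camera is affine, i.e. image = M * point + t (no perspective divide).

def project_affine(point, camera):
    """Project a 3D point with a hypothetical affine camera.

    camera holds two rows (m1, m2, m3, t), one per image coordinate.
    """
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in camera)


def point_from_affine_coords(coords, basis):
    """Build the 3D point p0 + a(p1-p0) + b(p2-p0) + c(p3-p0)."""
    a, b, c = coords
    p0, p1, p2, p3 = basis
    return tuple(
        p0[i] + a * (p1[i] - p0[i]) + b * (p2[i] - p0[i]) + c * (p3[i] - p0[i])
        for i in range(3)
    )


def reproject(coords, basis_proj):
    """Compute a point's image position from the four tracked basis projections.

    Uses only the point's affine coordinates (a, b, c) and the 2D positions
    of the basis points; no camera parameters or world coordinates are needed.
    """
    a, b, c = coords
    q0, q1, q2, q3 = basis_proj
    return tuple(
        q0[i] + a * (q1[i] - q0[i]) + b * (q2[i] - q0[i]) + c * (q3[i] - q0[i])
        for i in range(2)
    )


# Four non-coplanar basis points and an arbitrary affine camera (illustrative values).
basis = [(1.0, 0.0, 2.0), (3.0, 1.0, 2.0), (1.0, 4.0, 3.0), (0.0, 1.0, 6.0)]
camera = ((0.8, -0.2, 0.1, 5.0), (0.3, 0.9, -0.4, 2.0))
coords = (0.5, -1.0, 2.0)  # affine coordinates of a virtual-object vertex

# Image positions of the basis points, standing in for tracked features.
basis_proj = [project_affine(p, camera) for p in basis]

direct = project_affine(point_from_affine_coords(coords, basis), camera)
via_basis = reproject(coords, basis_proj)
```

In this sketch `direct` uses the camera explicitly, while `via_basis` uses only the four tracked image positions; under the affine-camera assumption the two agree up to floating-point rounding, which is why the camera never needs to be calibrated.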





Copyright information

© Springer-Verlag 1996

Authors and Affiliations

  • Kiriakos N. Kutulakos (1)
  • James R. Vallino (1)
  1. Computer Science Department, University of Rochester, Rochester
