Model-based object pose in 25 lines of code

  • Daniel F. DeMenthon
  • Larry S. Davis


In this paper, we describe a method for finding the pose of an object from a single image. We assume that we can detect and match in the image four or more noncoplanar feature points of the object, and that we know their relative geometry on the object. The method combines two algorithms. The first algorithm, POS (Pose from Orthography and Scaling), approximates the perspective projection with a scaled orthographic projection and finds the rotation matrix and the translation vector of the object by solving a linear system. The second algorithm, POSIT (POS with ITerations), uses in its iteration loop the approximate pose found by POS to compute better scaled orthographic projections of the feature points, then applies POS to these projections instead of the original image projections. POSIT converges to accurate pose measurements in a few iterations. It can be used with many feature points at once for added insensitivity to measurement errors and image noise. Compared to classic approaches that use Newton's method, POSIT does not require an initial guess and computes the pose using an order of magnitude fewer floating point operations; it may therefore be a useful alternative for real-time operation. When speed is not an issue, POSIT can be written in 25 lines or fewer in Mathematica; the code is provided in an Appendix.
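The POS/POSIT loop described in the abstract can be sketched in plain Python. This is an illustrative reconstruction under stated assumptions, not the authors' Mathematica code from the Appendix: the function names (`posit`, `inv3`, `dot`, `cross`), the convergence tolerance, the iteration cap, and the choice to average the two scale estimates are all assumptions made for this sketch. Image coordinates are assumed to be measured relative to the image center, in the same units as the focal length, and the first object point serves as the reference point.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate (hypothetical helper)."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]

def posit(object_points, image_points, focal_length, max_iter=20, tol=1e-9):
    """Sketch of POSIT: returns (R, T) with R as the rows i, j, k."""
    m0 = object_points[0]
    # Object vectors from the reference point to every other feature point.
    A = [[p[n] - m0[n] for n in range(3)] for p in object_points[1:]]
    # Normal-equation matrix A^T A; invertible when the points are noncoplanar.
    G = [[sum(row[r] * row[c] for row in A) for c in range(3)] for r in range(3)]
    G_inv = inv3(G)

    def least_squares(v):
        # Solves A u = v in the least-squares sense (the linear system of POS).
        atv = [sum(row[r] * vi for row, vi in zip(A, v)) for r in range(3)]
        return [dot(G_inv[r], atv) for r in range(3)]

    x0, y0 = image_points[0]
    eps = [0.0] * len(A)  # eps = 0 makes the first pass plain POS
    for _ in range(max_iter):
        # Scaled orthographic projections implied by the current eps.
        xs = [pt[0] * (1 + e) - x0 for pt, e in zip(image_points[1:], eps)]
        ys = [pt[1] * (1 + e) - y0 for pt, e in zip(image_points[1:], eps)]
        I = least_squares(xs)
        J = least_squares(ys)
        s1 = math.sqrt(dot(I, I))
        s2 = math.sqrt(dot(J, J))
        i = [c / s1 for c in I]
        j = [c / s2 for c in J]
        k = cross(i, j)
        z0 = focal_length / ((s1 + s2) / 2)  # averaging is an assumption
        new_eps = [dot(row, k) / z0 for row in A]
        converged = max(abs(a - b) for a, b in zip(new_eps, eps)) < tol
        eps = new_eps
        if converged:
            break
    T = [x0 * z0 / focal_length, y0 * z0 / focal_length, z0]
    return [i, j, k], T
```

Calling `posit(object_points, image_points, focal_length)` with at least four noncoplanar object points and their matched image points returns the rotation rows and the translation of the reference point. No initial pose guess is supplied, which mirrors the property the abstract contrasts with Newton-style methods.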







Copyright information

© Kluwer Academic Publishers 1995

Authors and Affiliations

  • Daniel F. DeMenthon (1)
  • Larry S. Davis (1)
  1. Computer Vision Laboratory, Center for Automation Research, University of Maryland, College Park
