Model-based object pose in 25 lines of code

International Journal of Computer Vision

Abstract

In this paper, we describe a method for finding the pose of an object from a single image. We assume that we can detect and match in the image four or more noncoplanar feature points of the object, and that we know their relative geometry on the object. The method combines two algorithms. The first algorithm, POS (Pose from Orthography and Scaling), approximates the perspective projection with a scaled orthographic projection and finds the rotation matrix and the translation vector of the object by solving a linear system. The second algorithm, POSIT (POS with ITerations), uses in its iteration loop the approximate pose found by POS to compute better scaled orthographic projections of the feature points, then applies POS to these projections instead of the original image projections. POSIT converges to accurate pose measurements in a few iterations. POSIT can be used with many feature points at once for added insensitivity to measurement errors and image noise. Compared to classic approaches making use of Newton's method, POSIT does not require starting from an initial guess, and computes the pose using an order of magnitude fewer floating point operations; it may therefore be a useful alternative for real-time operation. When speed is not an issue, POSIT can be written in 25 lines or fewer in Mathematica; the code is provided in an Appendix.
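
The Mathematica code from the Appendix is not reproduced on this page. As an illustration only, the POS/POSIT loop described above might be sketched in plain Python as follows. The function names (`posit`, `solve_least_squares`), the fixed iteration count, the averaging of the two scale estimates, and the convention that image coordinates are measured from the image center in the same units as the focal length are all assumptions of this sketch, not details taken from the abstract.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def solve_least_squares(A, b):
    """Least-squares solution x (3-vector) of A x = b via the normal equations."""
    cols = list(zip(*A))                                  # columns of A
    N = [[dot(ci, cj) for cj in cols] for ci in cols]     # A^T A, 3x3 symmetric
    rhs = [dot(c, b) for c in cols]                       # A^T b
    # Rows of adj(N); valid as the inverse's rows because N is symmetric.
    c0, c1, c2 = cross(N[1], N[2]), cross(N[2], N[0]), cross(N[0], N[1])
    det = dot(N[0], c0)   # vanishes when the feature points are coplanar
    return [dot(c, rhs) / det for c in (c0, c1, c2)]

def posit(object_points, image_points, focal_length, iterations=20):
    """Hypothetical POSIT sketch: returns (rotation rows, translation)."""
    m0 = object_points[0]                                 # reference point
    A = [[p[k] - m0[k] for k in range(3)] for p in object_points[1:]]
    x0, y0 = image_points[0]
    eps = [0.0] * len(A)          # eps = 0 makes the first pass plain POS
    for _step in range(iterations):
        # Scaled orthographic image vectors implied by the current eps.
        xp = [x * (1 + e) - x0 for (x, y), e in zip(image_points[1:], eps)]
        yp = [y * (1 + e) - y0 for (x, y), e in zip(image_points[1:], eps)]
        I = solve_least_squares(A, xp)                    # linear system (POS)
        J = solve_least_squares(A, yp)
        s1, s2 = math.sqrt(dot(I, I)), math.sqrt(dot(J, J))
        i = [c / s1 for c in I]
        j = [c / s2 for c in J]
        k = cross(i, j)
        z0 = focal_length / ((s1 + s2) / 2)               # depth of reference point
        eps = [dot(a, k) / z0 for a in A]                 # perspective correction
    rotation = [i, j, k]
    translation = [x0 * z0 / focal_length, y0 * z0 / focal_length, z0]
    return rotation, translation
```

Calling `posit` with the corners of a cube, their perspective projections, and the focal length should recover the pose used to generate those projections, provided the object is small relative to its distance from the camera. With noisy image points, the rows `i` and `j` are generally not exactly orthogonal, so a practical version would re-orthogonalize the rotation and stop once `eps` stops changing rather than run a fixed number of iterations.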

Cite this article

DeMenthon, D.F., Davis, L.S. Model-based object pose in 25 lines of code. Int J Comput Vision 15, 123–141 (1995). https://doi.org/10.1007/BF01450852

  • DOI: https://doi.org/10.1007/BF01450852
