Model-based object pose in 25 lines of code

Dementhon, Daniel F.; Davis, Larry S.

doi:10.1007/BF01450852

Published: June 1995

International Journal of Computer Vision, Volume 15, pages 123–141 (1995)

Abstract

In this paper, we describe a method for finding the pose of an object from a single image. We assume that we can detect and match in the image four or more noncoplanar feature points of the object, and that we know their relative geometry on the object. The method combines two algorithms. The first algorithm, POS (Pose from Orthography and Scaling), approximates the perspective projection with a scaled orthographic projection and finds the rotation matrix and the translation vector of the object by solving a linear system. The second algorithm, POSIT (POS with ITerations), uses in its iteration loop the approximate pose found by POS in order to compute better scaled orthographic projections of the feature points, then applies POS to these projections instead of the original image projections. POSIT converges to accurate pose measurements in a few iterations. POSIT can be used with many feature points at once for added insensitivity to measurement errors and image noise. Compared to classic approaches making use of Newton's method, POSIT does not require starting from an initial guess, and computes the pose using an order of magnitude fewer floating point operations; it may therefore be a useful alternative for real-time operation. When speed is not an issue, POSIT can be written in 25 lines or less in Mathematica; the code is provided in an Appendix.
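The loop described in the abstract can be illustrated with a short Python sketch. This is not the Mathematica code from the paper's Appendix, which is not reproduced on this page; it is an independent reading of the abstract under stated assumptions. The function name `posit`, the helper names, the parameters `max_iter` and `tol`, the averaging of the two scale estimates, and the convention that image coordinates are measured from the image center in the same units as the focal length are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def inv3(m):
    """Invert a 3x3 matrix given as a list of rows (adjugate formula)."""
    c0, c1, c2 = cross(m[1], m[2]), cross(m[2], m[0]), cross(m[0], m[1])
    det = dot(m[0], c0)
    return [[c0[r] / det, c1[r] / det, c2[r] / det] for r in range(3)]

def posit(object_pts, image_pts, focal, max_iter=50, tol=1e-12):
    """Hypothetical POSIT sketch: returns (rotation rows, translation).

    object_pts: >= 4 noncoplanar 3D points; the first is the reference point.
    image_pts:  matching (x, y) image points, relative to the image center.
    """
    m0 = object_pts[0]
    # Object vectors from the reference point to every other feature point.
    A = [[p[k] - m0[k] for k in range(3)] for p in object_pts[1:]]
    # Least-squares solve via the normal equations (A^T A) v = A^T b.
    AtA = [[sum(r[a] * r[b] for r in A) for b in range(3)] for a in range(3)]
    AtA_inv = inv3(AtA)

    def solve(rhs):
        atb = [sum(r[a] * v for r, v in zip(A, rhs)) for a in range(3)]
        return [dot(AtA_inv[a], atb) for a in range(3)]

    x0, y0 = image_pts[0]
    eps = [0.0] * len(A)  # eps = 0 gives the plain POS step on the first pass
    for _ in range(max_iter):
        # Corrected (scaled orthographic) image coordinates.
        xp = [pt[0] * (1.0 + e) - x0 for pt, e in zip(image_pts[1:], eps)]
        yp = [pt[1] * (1.0 + e) - y0 for pt, e in zip(image_pts[1:], eps)]
        I, J = solve(xp), solve(yp)
        s1, s2 = math.sqrt(dot(I, I)), math.sqrt(dot(J, J))
        i = [c / s1 for c in I]
        j = [c / s2 for c in J]
        k = cross(i, j)
        z0 = focal / ((s1 + s2) / 2.0)  # depth of the reference point
        new_eps = [dot(r, k) / z0 for r in A]
        converged = max(abs(a - b) for a, b in zip(new_eps, eps)) < tol
        eps = new_eps
        if converged:
            break
    translation = [x0 * z0 / focal, y0 * z0 / focal, z0]
    return [i, j, k], translation
```

The returned rotation rows are the camera axes expressed in object coordinates. With noisy image points they are not guaranteed to be exactly orthonormal, so a caller may want to re-orthogonalize them; that step is left out of this sketch.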

Author information

Authors and Affiliations

Computer Vision Laboratory, Center for Automation Research, University of Maryland, College Park, MD 20742
Daniel F. Dementhon & Larry S. Davis

About this article

Cite this article

Dementhon, D.F., Davis, L.S. Model-based object pose in 25 lines of code. Int J Comput Vision 15, 123–141 (1995). https://doi.org/10.1007/BF01450852

Received: 15 October 1992
Revised: 15 March 1994
Accepted: 15 May 1994
Issue Date: June 1995
DOI: https://doi.org/10.1007/BF01450852
