Defining the Pose of Any 3D Rigid Object and an Associated Distance

International Journal of Computer Vision

Abstract

The pose of a rigid object is usually regarded as a rigid transformation, described by a translation and a rotation. However, equating the pose space with the space of rigid transformations is in general inappropriate, as it does not account for objects with proper symmetries, which are common among man-made objects. In this article, we define pose as a distinguishable static state of an object, and equate a pose to a set of rigid transformations. Based solely on geometric considerations, we propose a frame-invariant metric on the space of possible poses, valid for any physical rigid object, and requiring no arbitrary tuning. This distance can be evaluated efficiently using a representation of poses within a Euclidean space of at most 12 dimensions, depending on the object’s symmetries. This makes it possible to efficiently perform neighborhood queries such as radius searches or k-nearest neighbor searches within a large set of poses using off-the-shelf methods. Pose averaging under this metric can similarly be performed easily, using a projection function from the Euclidean space onto the pose space. The practical value of these theoretical developments is illustrated with an application to pose estimation of instances of a 3D rigid object given an input depth map, via a Mean Shift procedure.

Acknowledgements

We would like to thank the anonymous reviewers for their insightful comments and suggestions that greatly helped to improve this article. Some of our illustrations are based on the following mesh models: “Stanford bunny”, from the Stanford University Computer Graphics Laboratory; “Eiffel Tower” created by Pranav Panchal; and “Şamdan 2” (candlestick), from Metin N. Those were respectively available online at http://graphics.stanford.edu/data/3Dscanrep, and on the GrabCAD and 3D Warehouse platforms, in May 2016.

Funding

Funding is provided by Association Nationale de la Recherche et de la Technologie (CIFRE 2014/0173).

Author information

Corresponding author

Correspondence to Romain Brégier.

Additional information

Communicated by Lourdes Agapito.

Appendices

Appendix A: Distance Simplification for a Revolution Object Without Rotoreflection Invariance

Using the same definition of \(\varvec{\varLambda }\) as in Sect. 5.3, the rotation part of the proposed distance for a revolution object without rotoreflection invariance can be rewritten in the following way:

$$\begin{aligned} \begin{aligned}&{{\mathrm{d_{rot}^2}}}(\mathscr {P}_1, \mathscr {P}_2) \\&\quad = \min _{\phi _1, \phi _2} \frac{1}{S} \int _{\mathscr {S}} \mu (\mathbf {x}) \Vert \mathbf {R}_2 \mathbf {R}_z^{\phi _2} \mathbf {x} - \mathbf {R}_1 \mathbf {R}_z^{\phi _1} \mathbf {x}\Vert ^2 ds \\&\quad =\min _{\phi _1, \phi _2} \Vert \mathbf {R}_2 \mathbf {R}_z^{\phi _2} \varvec{\varLambda }- \mathbf {R}_1 \mathbf {R}_z^{\phi _1} \varvec{\varLambda }\Vert _F^2. \end{aligned} \end{aligned}$$
(100)

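Since the Frobenius norm is invariant under rotations, both terms can be left-multiplied by \((\mathbf {R}_1 \mathbf {R}_z^{\phi _1})^{-1}\) without changing the norm, and the expression can be rewritten with the relative rotation \(\mathbf {R} \triangleq \mathbf {R}_1^{-1} \mathbf {R}_2\):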

$$\begin{aligned} {{\mathrm{d_{rot}^2}}}(\mathscr {P}_1, \mathscr {P}_2) = \min _{\phi _1, \phi _2} \Vert \mathbf {R}_z^{-\phi _1} \mathbf {R} \mathbf {R}_z^{\phi _2} \varvec{\varLambda }- \varvec{\varLambda }\Vert _F^2. \end{aligned}$$
(101)

We parametrize \(\mathbf {R}\) using Euler angles \((\tilde{\psi }, \theta , \tilde{\phi }) \in \mathbb {R}^3\) such that \(\mathbf {R} = \mathbf {R}_z^{\tilde{\psi }} \mathbf {R}_x^\theta \mathbf {R}_z^{\tilde{\phi }}\), considering the following elementary rotations:

$$\begin{aligned} \mathbf {R}_z^{\alpha } \triangleq \left( {\begin{matrix} \cos (\alpha ) &{}\quad -\sin (\alpha ) &{}\quad 0 \\ \sin (\alpha ) &{}\quad \cos (\alpha ) &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 1 \end{matrix}} \right) , \mathbf {R}_x^{\alpha } \triangleq \left( {\begin{matrix} 1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad \cos (\alpha ) &{}\quad -\sin (\alpha ) \\ 0 &{}\quad \sin (\alpha ) &{}\quad \cos (\alpha ) \end{matrix}} \right) . \end{aligned}$$
(102)

Injecting this parametrization into the previous expression and performing the changes of variables \(\psi \leftarrow \tilde{\psi } -\phi _1\) and \(\phi \leftarrow \tilde{\phi } + \phi _2\) leads us to the following expression:

$$\begin{aligned} \begin{aligned} {{\mathrm{d_{rot}^2}}}(\mathscr {P}_1, \mathscr {P}_2)&= \min _{\phi _1, \phi _2} \Vert \mathbf {R}_z^{-\phi _1} \mathbf {R}_z^{\tilde{\psi }} \mathbf {R}_x^\theta \mathbf {R}_z^{\tilde{\phi }} \mathbf {R}_z^{\phi _2} \varvec{\varLambda }- \varvec{\varLambda }\Vert _F^2 \\&= \min _{\psi , \phi } \Vert \mathbf {R}_z^\psi \mathbf {R}_x^\theta \mathbf {R}_z^\phi \varvec{\varLambda }- \varvec{\varLambda }\Vert _F^2. \end{aligned} \end{aligned}$$
(103)

Because of the specific shape of \(\varvec{\varLambda }\) (Eq. 30), the term to minimize can be decomposed into two parts:

$$\begin{aligned} \Vert \mathbf {R}_z^\psi \mathbf {R}_x^\theta \mathbf {R}_z^\phi \varvec{\varLambda }- \varvec{\varLambda }\Vert _F^2 ={}&\lambda _z^2 \underbrace{\Vert \mathbf {R}_z^\psi \mathbf {R}_x^\theta \mathbf {R}_z^\phi \mathbf {e}_z - \mathbf {e}_z \Vert ^2}_{a_{\psi , \phi }} \\&+ \lambda _r^2 \underbrace{(\Vert \mathbf {R}_z^\psi \mathbf {R}_x^\theta \mathbf {R}_z^\phi \mathbf {e}_x -\mathbf {e}_x \Vert ^2 + \Vert \mathbf {R}_z^\psi \mathbf {R}_x^\theta \mathbf {R}_z^\phi \mathbf {e}_y - \mathbf {e}_y \Vert ^2)}_{b_{\psi , \phi }}. \end{aligned}$$
(104)

Expanding this expression using the definition of the elementary rotations (102), these terms evaluate to:

$$\begin{aligned} \left\{ \begin{aligned} a_{\psi , \phi }&= 2(1-\cos (\theta )) \\ b_{\psi , \phi }&= 4 - 2 \cos (\psi + \phi )(1 + \cos (\theta )). \end{aligned} \right. \end{aligned}$$
(105)

The first term is independent of \(\psi \) and \(\phi \). The second one is easily minimized with respect to those two parameters: its minimum is reached when \(\cos (\psi + \phi ) = 1\), and turns out to be equal to the first term:

$$\begin{aligned} \min _{\psi , \phi } b_{\psi , \phi } = 2(1-\cos (\theta )). \end{aligned}$$
(106)
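The values in (105) and (106) can be checked by a direct expansion. The sketch below uses only the elementary rotations of (102); the notation \(r_{ij}\) for the entries of \(\mathbf {R}_z^\psi \mathbf {R}_x^\theta \mathbf {R}_z^\phi \) is introduced here for convenience and is not taken from the article. For any rotation matrix with entries \(r_{ij}\), the squared displacement of a unit basis vector \(\mathbf {e}_i\) is \(2(1 - r_{ii})\), so only the diagonal entries are needed:

$$\begin{aligned} r_{33}&= \cos (\theta ), \\ r_{11}&= \cos (\psi )\cos (\phi ) - \sin (\psi )\cos (\theta )\sin (\phi ), \\ r_{22}&= -\sin (\psi )\sin (\phi ) + \cos (\psi )\cos (\theta )\cos (\phi ), \\ r_{11} + r_{22}&= \cos (\psi + \phi )(1 + \cos (\theta )), \\ a_{\psi , \phi }&= 2(1 - r_{33}) = 2(1-\cos (\theta )), \\ b_{\psi , \phi }&= 4 - 2(r_{11} + r_{22}) = 4 - 2\cos (\psi + \phi )(1 + \cos (\theta )). \end{aligned}$$

Since \(1 + \cos (\theta ) \ge 0\), the minimum of \(b_{\psi , \phi }\) is reached for \(\cos (\psi + \phi ) = 1\), which gives \(4 - 2(1 + \cos (\theta )) = 2(1-\cos (\theta ))\), the closed-form value stated in (106).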

This result allows the distance between the two poses to be evaluated in closed form. However, having to compute a relative rotation between the two poses and perform an Euler decomposition is cumbersome, and it would not yield a pose representation that is efficient for neighborhood queries. We prefer instead to use the following property

$$\begin{aligned} \begin{aligned} 2(1-\cos (\theta ))&= \Vert \mathbf {R} \mathbf {e}_z - \mathbf {e}_z \Vert ^2 \\&= \Vert \mathbf {R}_2 \mathbf {e}_z - \mathbf {R}_1 \mathbf {e}_z \Vert ^2 \end{aligned} \end{aligned}$$
(107)

in order to express the rotation part of the square distance as a function of the distance between the revolution axes of the object at the two poses:

$$\begin{aligned} {{\mathrm{d_{rot}^2}}}(\mathscr {P}_1, \mathscr {P}_2) = (\lambda _r^2 + \lambda _z^2) \Vert \mathbf {R}_2 \mathbf {e}_z - \mathbf {R}_1 \mathbf {e}_z \Vert ^2. \end{aligned}$$
(108)
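The simplification from (100) to (108) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it compares a coarse grid search over \((\phi _1, \phi _2)\) with the closed form, for two arbitrary orientations and hypothetical values of \(\lambda _r\) and \(\lambda _z\). All function names, the grid resolution and the test values are assumptions made for illustration.

```python
import math


def rot_z(angle):
    """Elementary rotation about the z axis, as in Eq. (102)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Elementary rotation about the x axis, as in Eq. (102)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def times_lambda(m, lam_r, lam_z):
    """Right-multiply m by Lambda = diag(lam_r, lam_r, lam_z)."""
    scales = (lam_r, lam_r, lam_z)
    return [[m[i][j] * scales[j] for j in range(3)] for i in range(3)]


def frobenius2(a, b):
    """Squared Frobenius norm of a - b."""
    return sum((a[i][j] - b[i][j]) ** 2 for i in range(3) for j in range(3))


def d_rot2_brute(r1, r2, lam_r, lam_z, steps=180):
    """Grid approximation of the minimum over (phi_1, phi_2) in Eq. (100)."""
    angles = [2.0 * math.pi * k / steps for k in range(steps)]
    side1 = [times_lambda(matmul(r1, rot_z(p)), lam_r, lam_z) for p in angles]
    side2 = [times_lambda(matmul(r2, rot_z(p)), lam_r, lam_z) for p in angles]
    return min(frobenius2(a, b) for a in side1 for b in side2)


def d_rot2_closed(r1, r2, lam_r, lam_z):
    """Closed form of Eq. (108): scaled squared gap between revolution axes."""
    axis_gap2 = sum((r2[i][2] - r1[i][2]) ** 2 for i in range(3))
    return (lam_r ** 2 + lam_z ** 2) * axis_gap2


# Two arbitrary orientations and hypothetical lambda values (illustration only).
R1 = matmul(rot_z(0.3), matmul(rot_x(0.7), rot_z(-1.1)))
R2 = matmul(rot_x(1.9), rot_z(0.4))
LAM_R, LAM_Z = 1.0, 0.5

brute = d_rot2_brute(R1, R2, LAM_R, LAM_Z)
closed = d_rot2_closed(R1, R2, LAM_R, LAM_Z)
print(f"grid search: {brute:.5f}  closed form: {closed:.5f}")
```

The grid search can only overestimate the true minimum, and with a 2 degree step the gap stays small, so the two printed values should agree to about three decimal places. The closed form needs only the third column of each rotation matrix, which is what makes it usable as a Euclidean representation.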

Appendix B: Minimum Distance Between Representatives of the Same Pose

In this appendix, we show how to compute the minimum distance T between representatives of the same pose (see Definition 3) for the objects of our application example.

The bunny and the candlestick admit one representative per pose, hence \(T=+\infty \) for those by convention.

The case of the rocket requires a short calculation. For the sake of simplicity, we consider an object frame whose z axis corresponds to the symmetry axis of the rocket. In this frame, the proper symmetry group of the rocket can be expressed as

$$\begin{aligned} G= \left\{ \mathbf {I}, \mathbf {R}_z^{2 \pi /3}, \mathbf {R}_z^{-2 \pi /3} \right\} \end{aligned}$$
(109)

and the square root of the covariance matrix as

$$\begin{aligned} \varvec{\varLambda }= {{\mathrm{diag}}}(\lambda _r, \lambda _r, \lambda _z). \end{aligned}$$
(110)

We choose to consider the reference pose \(\mathscr {P}_0\) and one of its representatives \(\mathbf {p}\) (underbraced below) for the computation of T as it makes the computation simpler. Representatives \(\mathscr {R}(\mathscr {P}_0)\) of this pose are

$$\begin{aligned} \left\{ \underbrace{\left( \begin{array}{c} {{\mathrm{vec}}}(\varvec{\varLambda }) \\ \mathbf {0}_3 \end{array} \right) }_{\mathbf {p}}, \left( \begin{array}{c} {{\mathrm{vec}}}(\mathbf {R}_z^{2 \pi /3} \varvec{\varLambda }) \\ \mathbf {0}_3 \end{array} \right) , \left( \begin{array}{c} {{\mathrm{vec}}}(\mathbf {R}_z^{-2 \pi /3} \varvec{\varLambda }) \\ \mathbf {0}_3 \end{array} \right) \right\} . \end{aligned}$$
(111)

With those choices, T evaluates to:

$$\begin{aligned} \begin{aligned} T&= \min _{\mathbf {q} \in \mathscr {R}(\mathscr {P}_0), \mathbf {q}\ne \mathbf {p}} \Vert \mathbf {q} - \mathbf {p} \Vert \\&= \min \Vert \mathbf {R}_z^{\pm 2 \pi /3} \varvec{\varLambda }- \varvec{\varLambda }\Vert _F\\&= \sqrt{6} \lambda _r. \end{aligned} \end{aligned}$$
(112)
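The value in (112) can be reproduced numerically. The sketch below builds the three 12-dimensional representatives of (111) and takes the smallest Euclidean distance to \(\mathbf {p}\). The function names and the values of \(\lambda _r\) and \(\lambda _z\) are hypothetical and chosen only for illustration.

```python
import math


def rot_z(angle):
    """Elementary rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def representative(rotation, lam_r, lam_z):
    """12-D vector (vec(rotation @ Lambda), 0_3) as in Eq. (111)."""
    scales = (lam_r, lam_r, lam_z)
    # vec() stacks the columns of rotation @ diag(lam_r, lam_r, lam_z).
    vec = [rotation[i][j] * scales[j] for j in range(3) for i in range(3)]
    return vec + [0.0, 0.0, 0.0]  # zero translation for the reference pose


def min_representative_distance(lam_r, lam_z):
    """Smallest distance between p and the other representatives of P_0."""
    p = representative(rot_z(0.0), lam_r, lam_z)
    others = [rot_z(2.0 * math.pi / 3.0), rot_z(-2.0 * math.pi / 3.0)]
    return min(math.dist(p, representative(g, lam_r, lam_z)) for g in others)


LAM_R, LAM_Z = 0.8, 2.0  # hypothetical values, for illustration only
T = min_representative_distance(LAM_R, LAM_Z)
print(T, math.sqrt(6.0) * LAM_R)
```

Both printed numbers should coincide up to rounding. The result does not depend on \(\lambda _z\), since a rotation about the z axis leaves the third column of \(\varvec{\varLambda }\) unchanged.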

The threshold \(\frac{T}{4}\) of Proposition 9 therefore corresponds for the rocket to the value \(\frac{\sqrt{6}}{4} \lambda _r\).

Appendix C: Numerical Recipes for a Triangular Mesh

Center of mass, area and covariance matrix of the surface of a triangular mesh \(\mathscr {S} = \bigcup _i \mathscr {T}(\mathbf {a}_i, \mathbf {b}_i, \mathbf {c}_i)\)—where \(\mathscr {T}(\mathbf {a}, \mathbf {b}, \mathbf {c})\) is a triangle defined by three vertices \(\mathbf {a}, \mathbf {b}, \mathbf {c} \in \mathbb {R}^3\)—can be computed easily through the contributions of its triangles.

Let \(\mathscr {T}(\mathbf {a}, \mathbf {b}, \mathbf {c})\) be a given triangle. Its area can be computed thanks to a cross product:

$$\begin{aligned} S_{\mathbf {a}, \mathbf {b}, \mathbf {c}} = \frac{\Vert (\mathbf {b} - \mathbf {a}) \times (\mathbf {c} - \mathbf {a})\Vert }{2}, \end{aligned}$$
(113)

its center of mass through:

$$\begin{aligned} \mathbf {o}_{\mathbf {a}, \mathbf {b}, \mathbf {c}} = \frac{\mathbf {a} + \mathbf {b} + \mathbf {c}}{3}, \end{aligned}$$
(114)

and its uncentered covariance matrix via:

$$\begin{aligned} \varvec{\sigma }_{\mathbf {a}, \mathbf {b}, \mathbf {c}} = \frac{S_{\mathbf {a}, \mathbf {b}, \mathbf {c}}}{12} \left( 9 \mathbf {o}_{\mathbf {a}, \mathbf {b}, \mathbf {c}} \mathbf {o}_{\mathbf {a}, \mathbf {b}, \mathbf {c}}^\top + \mathbf {a} \mathbf {a}^\top + \mathbf {b} \mathbf {b}^\top + \mathbf {c} \mathbf {c}^\top \right) . \end{aligned}$$
(115)

From those results, we deduce the expression of the surface area of the mesh:

$$\begin{aligned} S= \sum _i S_{\mathbf {a}_i, \mathbf {b}_i, \mathbf {c}_i}, \end{aligned}$$
(116)

its center of mass:

$$\begin{aligned} \mathbf {o}= \frac{1}{S} \sum _i S_{\mathbf {a}_i, \mathbf {b}_i, \mathbf {c}_i} \mathbf {o}_{\mathbf {a}_i, \mathbf {b}_i, \mathbf {c}_i}, \end{aligned}$$
(117)

and its normalized covariance matrix, if the center of mass of the mesh is chosen as origin of the object frame:

$$\begin{aligned} \varvec{\varLambda }^2 = \frac{1}{S} \sum _i \varvec{\sigma }_{\mathbf {a}_i, \mathbf {b}_i, \mathbf {c}_i}. \end{aligned}$$
(118)
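The recipes of this appendix translate directly into code. The following is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the mesh is given as a list of vertex triples, the center of mass is taken as the area-weighted mean of the triangle centroids (that is, normalized by the total area \(S\)), and the covariance of (115) and (118) is accumulated with vertices expressed relative to that center. All function names are hypothetical.

```python
def sub(u, v):
    """Component-wise difference of two 3-D points."""
    return [u[i] - v[i] for i in range(3)]


def cross(u, v):
    """Cross product of two 3-D vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def triangle_area(a, b, c):
    """Eq. (113): half the norm of the cross product of two edges."""
    n = cross(sub(b, a), sub(c, a))
    return 0.5 * sum(x * x for x in n) ** 0.5


def centroid(a, b, c):
    """Eq. (114): mean of the three vertices."""
    return [(a[i] + b[i] + c[i]) / 3.0 for i in range(3)]


def mesh_statistics(triangles):
    """Return (area, center of mass, normalized covariance) of a mesh surface.

    `triangles` is a list of (a, b, c) vertex triples.
    """
    areas = [triangle_area(a, b, c) for a, b, c in triangles]
    total = sum(areas)
    # Area-weighted mean of the triangle centroids, normalized by total area.
    center = [0.0, 0.0, 0.0]
    for s, (a, b, c) in zip(areas, triangles):
        o = centroid(a, b, c)
        for i in range(3):
            center[i] += s * o[i] / total
    # Accumulate Eq. (115) with vertices expressed relative to the center,
    # then normalize by the total area as in Eq. (118).
    cov = [[0.0] * 3 for _ in range(3)]
    for s, tri in zip(areas, triangles):
        a, b, c = (sub(v, center) for v in tri)
        o = centroid(a, b, c)
        for i in range(3):
            for j in range(3):
                term = 9.0 * o[i] * o[j] + a[i] * a[j] + b[i] * b[j] + c[i] * c[j]
                cov[i][j] += s / 12.0 * term / total
    return total, center, cov


# Unit square in the plane z = 0, split into two triangles.
square = [((0, 0, 0), (1, 0, 0), (1, 1, 0)),
          ((0, 0, 0), (1, 1, 0), (0, 1, 0))]
area, center, cov = mesh_statistics(square)
print(area, center, cov[0][0], cov[1][1], cov[2][2])
```

For the unit square, the expected area is 1, the center is (0.5, 0.5, 0), and the covariance is diag(1/12, 1/12, 0), which matches the second moment of a uniform distribution on a unit interval. The returned matrix corresponds to \(\varvec{\varLambda }^2\); taking its matrix square root to obtain \(\varvec{\varLambda }\) is left out of this sketch.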

Cite this article

Brégier, R., Devernay, F., Leyrit, L. et al. Defining the Pose of Any 3D Rigid Object and an Associated Distance. Int J Comput Vis 126, 571–596 (2018). https://doi.org/10.1007/s11263-017-1052-4

  • DOI: https://doi.org/10.1007/s11263-017-1052-4