Rotation Averaging

International Journal of Computer Vision

Abstract

This paper is conceived as a tutorial on rotation averaging, summarizing the research that has been carried out in this area; it discusses methods for single-view and multiple-view rotation averaging, and provides proofs of convergence and convexity in many cases. At the same time, it contains many new results, which were developed to fill gaps in knowledge, answering fundamental questions such as the radius of convergence of the algorithms and the existence of local minima. These matters, or even proofs of correctness, have in many cases not been considered in the Computer Vision literature. We consider three main problems: single rotation averaging, in which a single rotation is computed starting from several measurements; multiple-rotation averaging, in which absolute orientations are computed from several relative orientation measurements; and conjugate rotation averaging, which relates a pair of coordinate frames. This last is related to the hand-eye coordination problem and to multiple-camera calibration.

Notes

  1. La notion générale de variété est assez difficile à définir avec précision. [The general notion of a manifold is rather difficult to define with precision.] (Cartan 1951, p. 56.)

  2. For convenience of notation, we consider the index \(n\) to mean \(0\), so that \(\mathtt{{R}}_{i+1}\) means \(\mathtt{{R}}_0\) and \(\mathtt{{R}}_{i,i+1}\) means \(\mathtt{{R}}_{n-1,0}\) when \(i = n-1\).

References

  • Absil, P.-A., Mahony, R., & Sepulchre, R. (2008). Optimization algorithms on matrix manifolds. Princeton, NJ: Princeton University Press (With a foreword by Paul Van Dooren).

  • Afsari, B. (2011). Riemannian \(L^p\) center of mass: Existence, uniqueness, and convexity. Proceedings of the American Mathematical Society, 139(2), 655–673.

  • Agrawal, M. (2006). A Lie algebraic approach for consistent pose registration for general euclidean motion. In International conference on intelligent robots and systems (pp. 1891–1897), October 2006.

  • Altmann, S. L. (1986). Rotations, quaternions, and double groups. New York: Oxford Science Publications/The Clarendon Press Oxford University Press.

  • Asgharbeygi, N., & Maleki, A. (2008). Geodesic k-means clustering. In 19th international conference on pattern recognition, ICPR 2008 (pp. 1–4), December 2008.

  • Baker, P., Fermüller, C., Aloimonos, Y., & Pless, R. (2001). A spherical eye from multiple cameras (makes better models of the world). In Proceedings of IEEE conference on computer vision and pattern recognition (Vol. 1, p. 576). Los Alamitos, CA: IEEE Computer Society.

  • Beltrami, E. (1868). Teoria fondamentale degli spazii di curvatura costante. Annali di Matematica pura ed Applicata, II (2nd series) (pp. 232–255).

  • Buchholz, S., & Sommer, G. (2005). On averaging in Clifford groups. Computer Algebra and Geometric Algebra with Applications (pp. 229–238). Berlin: Springer.

  • Cartan, É. (1951). Leçons sur la géométrie des espaces de Riemann (2nd ed.). Paris: Gauthier-Villars.

  • Clipp, B., Kim, J.-H., Frahm, J.-M., Pollefeys, M., & Hartley, R. (2008). Robust 6DOF motion estimation for non-overlapping multi-camera systems. In Workshop on applications of computer vision, WACV08 (pp. 1–8), January 2008.

  • Corcuera, J. M., & Kendall, W. S. (1999). Riemannian barycentres and geodesic convexity. Mathematical Proceedings of the Cambridge Philosophical Society, 127, 253–269.

  • Dai, Y., Trumpf, J., Li, H., Barnes, N., & Hartley, R. (2009). Rotation averaging with application to camera-rig calibration. In Proceedings of Asian conference on computer vision, Xian.

  • Daniilidis, K. (1998). Hand-eye calibration using dual quaternions. International Journal of Robotics Research, 18, 286–298.

  • Devarajan, D., & Radke, R. J. (2007). Calibrating distributed camera networks using belief propagation. EURASIP Journal on Advances in Signal Processing, 1, 2007.

  • Eckhardt, U. (1980). Weber’s problem and Weiszfeld’s algorithm in general spaces. Mathematical Programming, 18(1), 186–196.

  • Edelman, A., Arias, T. A., & Smith, S. T. (1998). The geometry of algorithms with orthogonality constraints. SIAM Journal on Matrix Analysis and Applications, 20(2), 303–353.

  • Esquivel, S., Woelk, F., & Koch, R. (2007). Calibration of a multi-camera rig from non-overlapping views. In DAGM07 (pp. 82–91).

  • Fiori, S., & Tanaka, T. (2008). An averaging method for a committee of special-orthogonal-group machines. In IEEE international symposium on circuits and systems, ISCAS 2008 (pp. 2170–2173), May 2008.

  • Fletcher, P., Lu, C., & Joshi, S. (2003). Statistics of shape via principal geodesic analysis on Lie groups. In Proceedings of IEEE conference on computer vision and pattern recognition (Vol. 1, pp. I-95–I-101), June 2003.

  • Fletcher, P. T., Venkatasubramanian, S., & Joshi, S. (2009). The geometric median on Riemannian manifolds with applications to robust atlas estimation. Neuroimage, 45(1 Suppl), 143–152.

  • Goodall, C. (1991). Procrustes methods in the statistical analysis of shape. Journal of the Royal Statistical Society, B, 53(2), 285–339.

  • Govindu, V. M. (2001). Combining two-view constraints for motion estimation. In Proceedings of IEEE conference on computer vision and pattern recognition (Vol. 2, pp. 218–225). IEEE Computer Society: Los Alamitos, CA.

  • Govindu, V. M. (2004). Lie-algebraic averaging for globally consistent motion estimation. In Proceedings of IEEE conference on computer vision and pattern recognition (Vol. 1, pp. 684–691). Los Alamitos, CA: IEEE Computer Society.

  • Govindu, V. M. (2006). Robustness in motion averaging. In Proceedings of Asian conference on computer vision (pp. 457–466).

  • Gramkow, C. (2001). On averaging rotations. International Journal of Computer Vision, 42(1–2), 7–16.

  • Grove, K., Karcher, H., & Ruh, E. A. (1974). Jacobi fields and Finsler metrics on compact Lie groups with an application to differentiable pinching problems. Mathematische Annalen, 211, 7–21.

  • Hartley, R., Aftab, K., & Trumpf, J. (2011). Rotation averaging using the Weiszfeld algorithm. In Proceedings of IEEE conference on computer vision and pattern recognition.

  • Hartley, R., & Kahl, F. (2009). Global optimization through rotation space search. International Journal of Computer Vision, 82(1), 64–79.

  • Hartley, R., & Schaffalitzky, F. (2004). \({L}_\infty \) minimization in geometric reconstruction problems. In Proceedings of IEEE conference on computer vision and pattern recognition (pp. I-504–I-509), Washington DC, June 2004.

  • Hartley, R., & Trumpf, J. (2012). Characterization of weakly convex sets in projective space. Technical report, Australian National University.

  • Hartley, R., Trumpf, J., & Dai, Y. (2010). Rotation averaging and weak convexity. In Proceedings of the 19th international symposium on mathematical theory of networks and systems (MTNS) (pp. 2435–2442).

  • Hartley, R., & Zisserman, A. (2004). Multiple view geometry in computer vision (2nd ed.). Cambridge: Cambridge University Press.

  • Horn, B. K. P., Hilden, H., & Negahdaripour, S. (1988). Closed-form solution of absolute orientation using orthonormal matrices. Journal of the Optical Society of America, 5(7), 1127–1135.

  • Humbert, M., Gey, N., Muller, J., & Esling, C. (1996). Determination of a mean orientation from a cloud of orientations. Application to electron back-scattering pattern measurements. Journal of Applied Crystallography, 29(6), 662–666.

  • Humbert, M., Gey, N., Muller, J., & Esling, C. (1998). Response to Morawiec’s (1998) comment on Determination of a mean orientation from a cloud of orientations. Application to electron back-scattering pattern measurements. Journal of Applied Crystallography, 31(3), 485.

  • Hüper, K. (2002). A calculus approach to matrix eigenvalue algorithms. Habilitationsschrift, Universität Würzburg, Germany, July.

  • Kahl, F. (2005). Multiple view geometry and the \({L}_\infty \)-norm. In Proceedings of international conference on computer vision (pp. 1002–1009).

  • Kahl, F., & Hartley, R. (2008). Multiple view geometry under the \(L_\infty \)-norm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(9), 1603–1617.

  • Kanatani, K. (1990). Group-theoretical methods in image understanding. Berlin: Springer.

  • Karcher, H. (1977). Riemannian center of mass and mollifier smoothing. Communications on Pure and Applied Mathematics, 30(5), 509–541.

  • Kaucic, R., Hartley, R., & Dano, N. (2001). Plane-based projective reconstruction. In Proceedings of 8th international conference on computer vision (pp. I-420–I-427), Vancouver, Canada.

  • Kim, J.-H., Hartley, R., Frahm, J.-M., & Pollefeys, M. (2007). Visual odometry for non-overlapping views using second-order cone programming. In Proceedings of Asian conference on computer vision (Vol. 2, pp. 353–362), November 2007.

  • Kim, J.-H., Li, H., & Hartley, R. (2008). Motion estimation for multi-camera systems using global optimization. In Proceedings of IEEE conference on computer vision and pattern recognition.

  • Kim, J.-H., Li, H., & Hartley, R. (2010). Motion estimation for non-overlapping multi-camera rigs: Linear algebraic and \(L_\infty \) geometric solutions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6), 1044–1059.

  • Krakowski, K., Hüper, K., & Manton, J. (2007). On the computation of the Karcher mean on spheres and special orthogonal groups. In RoboMat, workshop on robotics and mathematics. Portugal: Coimbra.

  • Kumar, R., Ilie, A., Frahm, J.-M., & Pollefeys, M. (June 2008). Simple calibration of non-overlapping cameras with a mirror. In Proceedings of IEEE conference on computer vision and pattern recognition.

  • Le, H. (2001). Locating Fréchet means with application to shape spaces. Advances in Applied Probability, 33, 324–338.

  • Le, H. (2004). Estimation of Riemannian barycentres. LMS Journal of Computation and Mathematics, 7, 193–200.

  • Lébraly, P., Deymier, C., Ait-Aider, O., Royer, E., & Dhome, M. (2010). Flexible extrinsic calibration of non-overlapping cameras using a planar mirror: Application to vision-based robotics. In 2010 IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 5640–5647). Taipei: IEEE.

  • Li, H., Hartley, R., & Kim, J.-H. (2008). Linear approach to motion estimation using generalized camera models. In Proceedings of IEEE conference on computer vision and pattern recognition.

  • Li, Y. (1998). A Newton acceleration of the Weiszfeld algorithm for minimizing the sum of euclidean distances. Computational Optimization and Applications, 10, 219–242.

  • Lu, F., & Milios, E. (1997). Globally consistent range scan alignment for environment mapping. Autonomous Robots, 4(4), 333–349.

  • Manton, J. H. (2004). A globally convergent numerical algorithm for computing the centre of mass on compact Lie groups. In Proceedings of the eighth international conference on control, automation, robotics and vision (pp. 2211–2216), Kunming, China, December 2004.

  • Markley, F., Cheng, Y., Crassidis, J., & Oshman, Y. (2007). Averaging quaternions. Journal of Guidance, Control, and Dynamics, 30(4), 1193–1197.

  • Martinec, D., & Pajdla, T. (June 2007). Robust rotation and translation estimation in multiview reconstruction. In Proceedings of IEEE conference on computer vision and pattern recognition.

  • Massey, W. (1977). Algebraic topology: An introduction. Berlin: Springer.

  • Moakher, M. (2002). Means and averaging in the group of rotations. SIAM Journal on Matrix Analysis and Applications, 24(1), 1–16.

  • Morawiec, A. (1998). Comment on Determination of a mean orientation from a cloud of orientations. Application to electron back-scattering pattern measurements by Humbert et al. (1996). Journal of Applied Crystallography, 31(3), 484.

  • Morawiec, A. (1998). A note on mean orientation. Journal of Applied Crystallography, 31(5), 818–819.

  • Morawiec, A. (2004). Orientations and rotations: Computations in crystallographic textures. Berlin: Springer.

  • Myers, S. (1945). Arcs and geodesics in metric spaces. Transactions of the American Mathematical Society, 57(2), 217–227.

  • Nocedal, J., & Wright, S. (1999). Numerical optimization. Berlin: Springer.

  • Ostresh, L. (1978). Convergence of a class of iterative methods for solving weber location problem. Operations Research, 26, 597–609.

  • Park, F., & Martin, B. (1994). Robot sensor calibration: solving AX=XB on the euclidean group. IEEE Transactions on Robotics and Automation, 10(5), 717–721.

  • Pennec, X. (1998). Computing the mean of geometric features: Application to the mean rotation. Technical Report INRIA RR-3371, INRIA.

  • Pless, R. (2003). Using many cameras as one. In Proceedings of IEEE conference on computer vision and pattern recognition.

  • Qi, C., Gallivan, K. A., & Absil, P.-A. (2010). Riemannian BFGS algorithm with applications. In M. Diehl, F. Glineur, E. Jarlebring, & W. Michiels (Eds.), Recent advances in optimization and its applications in engineering (pp. 183–192). Berlin: Springer.

  • Rinner, B., & Wolf, W. (2008). A bright future for distributed smart cameras. Proceedings of the IEEE, 96(10), 1562–1564.

  • Rockafellar, R. (1970). Convex analysis. Princeton, NJ: Princeton University Press.

  • Rodrigues, R., Barreto, J., & Nunes, U. (2010). Camera pose estimation using images of planar mirror reflections. Computer Vision—ECCV, 2010, 382–395.

  • Rother, C., & Carlsson, S. (2001). Linear multi view reconstruction and camera recovery. In Proceedings of 8th international conference on computer vision (pp. I-42–I-49), Vancouver, Canada.

  • Sarlette, A., & Sepulchre, R. (2009). Consensus optimization on manifolds. SIAM Journal on Control and Optimization, 48(1), 56–76.

  • Sim, K., & Hartley, R. (2006). Recovering camera motion using \({L}_{\infty }\) minimization. In Proceedings of IEEE conference on computer vision and pattern recognition, New York City.

  • Steiner, J. (1826). Einige Gesetze über die Theilung der Ebene und des Raumes. Journal für Die Reine Und Angewandte Mathematik, 1, 349–364.

  • Strobl, K., & Hirzinger, G. (2006). Optimal hand-eye calibration. In 2006 IEEE/RSJ international conference on intelligent robots and systems (pp. 4647–4653), October 2006.

  • Sturm, P., & Bonfort, T. (2006). How to compute the pose of an object without a direct view? Computer Vision—ACCV, 2006, 21–31.

  • Subbarao, R., & Meer, P. (2009). Nonlinear mean shift over Riemannian manifolds. International Journal of Computer Vision, 84(1), 1–20.

  • Teller, S., Antone, M., Bodnar, Z., Bosse, M., Coorg, S., Jethwa, M., et al. (2003). Calibrated, registered images of an extended urban area. International Journal of Computer Vision, 53(1), 93–107.

  • Tron, R., Vidal, R., & Terzis, A. (2008). Distributed pose averaging in camera networks via consensus on SE(3). In Second ACM/IEEE international conference on distributed smart cameras, September 2008.

  • Weber, A. (1909). Über den Standort der Industrien. Teil 1, Reine Theorie des Standorts. Tübingen: J.C.B. Mohr.

  • Weiszfeld, E. (1937). Sur le point pour lequel la somme des distances de n points donnes est minimum. Tohoku Mathematical Journal, 43, 355–386.

  • Wu, F., Wang, Z., & Hu, Z. (2009). Cayley transformation and numerical stability of calibration equation. International Journal of Computer Vision, 82(2), 156–184.

  • Yang, L. (2010). Riemannian median and its estimation. LMS Journal of Computation and Mathematics, 13, 461–479.

  • Zhang, H. (1998). Hand/eye calibration for electronic assembly robots. IEEE Transactions on Robotics and Automation, 14(4), 612–616.

Acknowledgments

This work was partially supported by NICTA, a research laboratory funded by the Australian Government, in part through the Australian Research Council.

Author information

Corresponding author

Correspondence to Richard Hartley.

Appendices

Appendix: Convexity

Of major relevance to questions of convergence and uniqueness of solutions of averaging problems is determining if and where the defined cost functions are convex functions.

In this section we consider the question of convexity of a function measuring distance in \(\mathrm{SO}(3)\) from a given rotation \(\mathtt{{R}}\). Since we are dealing with a function defined on \(\mathrm{SO}(3)\), rather than a Euclidean space, we will need the concept of geodesic convexity to analyze this problem.

The general definition of convexity of a function in \(\mathbb R ^n\) is as follows. Given a convex region \(U \subset \mathbb R ^n\) a function \(f\) defined on \(U\) is convex if for any two points \(\mathbf{x}_0\) and \(\mathbf{x}_1\) in \(U\), and any point \(\mathbf{y}\) lying on the line segment bounded by \(\mathbf{x}_0\) and \(\mathbf{x}_1\), given by \(\mathbf{y} = (1 - \lambda ) \mathbf{x}_0 + \lambda \mathbf{x}_1\) with \(0 \le \lambda \le 1\), we have

$$\begin{aligned} f(\mathbf{y}) \le (1 - \lambda ) f(\mathbf{x}_0) + \lambda f(\mathbf{x}_1). \end{aligned}$$

In adapting this definition to \(\mathrm{SO}(3)\), or indeed to any Riemannian or differentiable manifold, the role of a line is naturally taken by a geodesic. The appropriate definition of a convex set in \(\mathrm{SO}(3)\) is a little less clear, and will be considered next.

1.1 Convex Sets in \(\mathrm{SO}(3)\)

As discussed in Sect. 4 the geodesics on \(\mathrm{SO}(3)\) are doubly covered by great circles on \(S^3\) and there is a uniform length scaling by a factor of \(2\) between the geodesics on \(\mathrm{SO}(3)\) and those on \(S^3\). In particular, we see that the geodesics on \(\mathrm{SO}(3)\) are closed curves with a total length of \(2\pi \). There are exactly two geodesic segments between any two points in \(\mathrm{SO}(3)\) (without exception). Given two points (rotations) \(\mathtt{{R}}_0\) and \(\mathtt{{R}}_1\) in \(\mathrm{SO}(3)\), we call the shorter of the two geodesic segments from \(\mathtt{{R}}_0\) to \(\mathtt{{R}}_1\) the short geodesic segment between these points. If \(\mathtt{{R}}_0\) and \(\mathtt{{R}}_1\) differ by a rotation through \(\pi \), then which of the two geodesic segments is the shorter one is ambiguous and hence there is no short geodesic segment between such points.
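The relationship between geodesics on \(\mathrm{SO}(3)\) and great circles on \(S^3\) can be made concrete with a small sketch. It assumes rotations are stored as unit quaternions \((w, x, y, z)\), so that \(\pm q\) represent the same rotation and the angular distance is twice the angle between the nearer pair of representatives. The function names are illustrative and do not come from the paper.

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation by `angle` about `axis`."""
    n = math.sqrt(sum(a * a for a in axis))
    s = math.sin(angle / 2.0) / n
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def angular_distance(q0, q1):
    """Geodesic distance in SO(3): twice the angle between +/-q0 and +/-q1 on S^3."""
    dot = abs(sum(a * b for a, b in zip(q0, q1)))
    return 2.0 * math.acos(min(1.0, dot))

def short_geodesic_point(q0, q1, lam):
    """Point at fraction `lam` along the short geodesic segment from q0 to q1.

    Returns None when the two rotations differ by a rotation through pi,
    because the short segment is then ambiguous.
    """
    dot = sum(a * b for a, b in zip(q0, q1))
    if abs(dot) < 1e-12:
        return None
    if dot < 0.0:
        # q1 and -q1 are the same rotation; use the representative nearer to q0.
        q1 = tuple(-b for b in q1)
        dot = -dot
    omega = math.acos(min(1.0, dot))
    if omega < 1e-12:
        return q0
    c0 = math.sin((1.0 - lam) * omega) / math.sin(omega)
    c1 = math.sin(lam * omega) / math.sin(omega)
    return tuple(c0 * a + c1 * b for a, b in zip(q0, q1))
```

For example, a rotation by \(3\pi /2\) about the \(z\)-axis lies at angular distance \(\pi /2\) from the identity, and the short segment runs through the rotations by negative angles about \(z\) rather than through the rotation by \(\pi \).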

For convenience, we repeat Definition 1, which defines two slightly different notions of geodesic convexity of sets in \(\mathrm{SO}(3)\). (The definition is generalizable to other manifolds.)

Definition 2

A non-empty region \(U \subset \mathrm{SO}(3)\) is called weakly convex if for any two points \(\mathtt{{R}}_0\) and \(\mathtt{{R}}_1\) in \(U\) exactly one geodesic segment from \(\mathtt{{R}}_0\) to \(\mathtt{{R}}_1\) lies entirely inside \(U\).

A weakly convex region \(U\subset \mathrm{SO}(3)\) is called convex if the geodesic segment from \(\mathtt{{R}}_0\) to \(\mathtt{{R}}_1\) in \(U\) is always the short geodesic segment between these points, having length strictly smaller than \(\pi \).

The empty set is not considered to be convex or weakly convex.

A closed ball of radius \(r\ge 0\) in \(\mathrm{SO}(3)\) is a set

$$\begin{aligned} B(\mathtt{{R}}, r) = \{\mathtt{{S}} \in \mathrm{SO}(3)\, | \, d_{\angle }(\mathtt{{S}}, \mathtt{{R}}) \le r\} \end{aligned}$$

for some \(\mathtt{{R}}\) in \(\mathrm{SO}(3)\).

Radius and Diameter. We introduce two useful pieces of terminology, the radius and diameter of a set. The diameter of a set \(C\) in \(\mathrm{SO}(3)\) is the supremum of \(d_{\angle }(\mathtt{{R}}, \mathtt{{S}})\) over all \(\mathtt{{R}}, \mathtt{{S}} \in C\). According to this definition, the diameter of a convex set is at most \(\pi \); moreover, no two points in the set actually achieve this bound.

An open ball of radius \(r>0\) in \(\mathrm{SO}(3)\), denoted \(\mathring{B}(\mathtt{{R}}, r)\), is the interior of the closed ball, consisting of rotations at distance strictly less than \(r\) from \(\mathtt{{R}}\). We emphasize for clarity that the balls \(B(\mathtt{{R}}, r)\) or \(\mathring{B}(\mathtt{{R}}, r)\) are defined in terms of the geodesic (angular) distance on \(\mathrm{SO}(3)\).

The radius of a set \(C\) in \(\mathrm{SO}(3)\) is the infimum of all \(r\) such that \(C\) is contained in some ball of radius \(r\). It is evident by the triangle inequality that the radius is at least half the diameter of the set.
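For a finite set of rotations these quantities are easy to evaluate numerically. The sketch below again assumes unit quaternions \((w, x, y, z)\) as the representation. It computes the diameter exactly, but only an upper bound on the radius, obtained from a ball about a centre supplied by the caller; finding the smallest enclosing ball is not attempted. All names are illustrative.

```python
import math
from itertools import combinations

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation by `angle` about `axis`."""
    n = math.sqrt(sum(a * a for a in axis))
    s = math.sin(angle / 2.0) / n
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def angular_distance(q0, q1):
    """Geodesic (angular) distance between two rotations given as unit quaternions."""
    dot = abs(sum(a * b for a, b in zip(q0, q1)))
    return 2.0 * math.acos(min(1.0, dot))

def in_closed_ball(q, centre, r):
    """Membership test for the closed ball B(centre, r)."""
    return angular_distance(q, centre) <= r

def diameter(points):
    """Largest pairwise angular distance in a finite set of rotations."""
    return max(angular_distance(a, b) for a, b in combinations(points, 2))

def enclosing_radius(points, centre):
    """Radius of the smallest closed ball about `centre` containing all points.

    This is only an upper bound on the radius of the set, since another
    centre might give a smaller ball.
    """
    return max(angular_distance(p, centre) for p in points)
```

By the triangle inequality, `diameter(points)` can never exceed `2 * enclosing_radius(points, centre)`, whichever centre is chosen.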

Lemma 6

A closed ball in \(\mathrm{SO}(3)\) is convex if and only if its radius is less than \(\pi /2\). Similarly, an open ball in \(\mathrm{SO}(3)\) is convex if and only if its radius is less than or equal to \(\pi /2\). A closed ball in \(\mathrm{SO}(3)\) is weakly convex if and only if its radius is less than \(\pi \), and an open ball in \(\mathrm{SO}(3)\) is weakly convex if and only if its radius is less than or equal to \(\pi \).

If we visualize this in terms of the quaternion sphere, the proof is straightforward, and hence omitted. Note that an open ball of radius \(\pi \) is the whole of \(\mathrm{SO}(3)\) except for one plane, consisting of rotations at distance \(\pi \) from the centre of the ball.
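One direction of the first claim of the lemma can be checked numerically. This is a sketch, not the omitted proof: it assumes the unit-quaternion representation \((w, x, y, z)\) and looks only at the pair of rotations by \(+r\) and \(-r\) about a common axis, both of which lie in the closed ball of radius \(r\) about the identity. The helper names are illustrative.

```python
import math

def quat_about_x(angle):
    """Unit quaternion (w, x, y, z) for a rotation by `angle` about the x-axis."""
    return (math.cos(angle / 2.0), math.sin(angle / 2.0), 0.0, 0.0)

def angular_distance(q0, q1):
    """Geodesic (angular) distance between two rotations given as unit quaternions."""
    dot = abs(sum(a * b for a, b in zip(q0, q1)))
    return 2.0 * math.acos(min(1.0, dot))

def opposite_pair_distance(r):
    """Distance between the rotations by +r and -r about the x-axis.

    Both rotations lie on the boundary of the closed ball B(I, r).
    """
    return angular_distance(quat_about_x(r), quat_about_x(-r))
```

At \(r = \pi /2\) the two rotations are at distance \(\pi \) from each other, so no short geodesic segment joins them and the closed ball cannot be convex. For any \(r < \pi /2\) the same pair is at distance \(2r < \pi \).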

Convex and weakly convex subsets of \(\mathrm{SO}(3)\) cannot be arbitrarily “large”, in the following precise sense.

Theorem 10

Any weakly convex subset of \(\mathrm{SO}(3)\) is contained in an open ball of radius \(\pi \). In other words, there exists a plane in \(\mathrm{SO}(3)\) (the boundary of the open ball) that does not meet the said weakly convex set. Any convex subset of \(\mathrm{SO}(3)\) is contained in a closed ball of radius \(2\pi /3\).

The proof of this theorem turns out to be surprisingly difficult (particularly the first part) and will be reported elsewhere (Hartley and Trumpf 2012). As a consequence of this result we may picture any weakly convex subset of \(\mathrm{SO}(3)\) simply as a convex set in \(\mathbb R ^3\) under a suitably chosen gnomonic projection, namely the one mapping the boundary of the containing ball of radius \(\pi \) to the plane at infinity (cf. Sect. 3.4). This is because the gnomonic projection maps geodesics to geodesics, and hence weakly convex sets to convex sets.
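The statement that the gnomonic projection maps geodesics to straight lines can be illustrated numerically. The sketch assumes a projection centred at the identity that sends a unit quaternion \((w, x, y, z)\) with \(w \ne 0\) to \((x, y, z)/w\), that is, a rotation by angle \(\theta \) about a unit axis \(\mathbf{v}\) to \(\tan (\theta /2)\,\mathbf{v}\). Rotations with \(w = 0\), at distance \(\pi \) from the identity, play the role of the plane at infinity. This convention and the function names are assumptions made for the example.

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation by `angle` about `axis`."""
    n = math.sqrt(sum(a * a for a in axis))
    s = math.sin(angle / 2.0) / n
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def gnomonic(q):
    """Gnomonic projection centred at the identity (assumed convention): (x, y, z) / w."""
    w, x, y, z = q
    return (x / w, y / w, z / w)

def geodesic_point(q0, q1, lam):
    """A point on the great circle through q0 and q1.

    A normalised linear blend stays on the great circle, although it does
    not move along it at constant speed.
    """
    q = [(1.0 - lam) * a + lam * b for a, b in zip(q0, q1)]
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def collinearity_residual(p0, p1, p):
    """Norm of (p - p0) x (p1 - p0); zero when the three points are collinear."""
    u = [a - b for a, b in zip(p, p0)]
    v = [a - b for a, b in zip(p1, p0)]
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return math.sqrt(cx * cx + cy * cy + cz * cz)
```

Projecting two rotations and any intermediate point of the geodesic through them gives three collinear points in \(\mathbb R ^3\), up to rounding error.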

Although we provide no proof here, we nevertheless make frequent use of the result of Theorem 10 for weakly convex sets. However, in a sense the rest of the paper does not depend on this result, as long as we are willing to modify the definition of weakly convex set to include the (redundant) condition that such a set lies inside an open ball of radius \(\pi \).

According to this theorem, the radius of a convex set is at most \(2\pi /3\), and a closed convex set must have radius strictly less than \(2\pi /3\). On the other hand, Lemma 6 states that a convex ball can have radius no greater than \(\pi /2\). It is therefore somewhat surprising that we claim that a ball of radius \(2\pi /3\) is required to contain any convex set. This bound is tight, however, as a simple example shows. Consider a regular tetrahedron in \(\mathbb R ^3\), centred at the origin. The inverse gnomonic map will take this to a tetrahedron in \(\mathrm{SO}(3)\) bounded by geodesic planes. Let the size of this tetrahedron be such that its vertices are at geodesic distance \(2\pi /3\) from its centre. Knowing that the angle \(\alpha \) between the vectors from the origin to any two vertices of a regular tetrahedron is given by \(\cos (\alpha ) = -1/3\), it may be verified directly using (13) (the cosine rule) that the angular distance between two vertices of the tetrahedron is equal to \(\pi \). It follows from this that for each vertex \(A\) of the tetrahedron, the whole geodesic plane passing through the three other vertices lies at distance \(\pi \) from \(A\). Consequently, no two points in the tetrahedron lie at a greater distance than \(\pi \) from each other. The interior of the tetrahedron is therefore convex, contained in a closed ball of radius \(2\pi / 3\), but not in any closed ball of lesser radius.

Observe that we may add a single vertex (or even the whole boundary, less one face) to this tetrahedron and it will still be convex, but will not lie in an open ball of radius \(2\pi /3\); thus we cannot replace the words “closed ball” with “open ball” in the theorem statement. Furthermore, the complete closed tetrahedron (although weakly convex) is not convex, since it contains points at an angular distance \(\pi \) from each other.
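The distances in the tetrahedron example can be confirmed numerically. The sketch below does not use the cosine rule (13) cited in the text; it substitutes the unit-quaternion inner product, under which the angular distance between two rotations is \(2\arccos |\langle q_0, q_1\rangle |\). The vertex directions \((1,1,1)\), \((1,-1,-1)\), \((-1,1,-1)\), \((-1,-1,1)\) are one convenient choice of regular tetrahedron and are an assumption of the example.

```python
import math
from itertools import combinations

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation by `angle` about `axis`."""
    n = math.sqrt(sum(a * a for a in axis))
    s = math.sin(angle / 2.0) / n
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def angular_distance(q0, q1):
    """Geodesic (angular) distance between two rotations given as unit quaternions."""
    dot = abs(sum(a * b for a, b in zip(q0, q1)))
    return 2.0 * math.acos(min(1.0, dot))

# Directions to the vertices of a regular tetrahedron centred at the origin;
# any two of them meet at an angle alpha with cos(alpha) = -1/3.
DIRECTIONS = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0), (-1.0, -1.0, 1.0)]

# Vertices in SO(3): rotations through 2*pi/3 about each direction.
VERTICES = [quat_from_axis_angle(d, 2.0 * math.pi / 3.0) for d in DIRECTIONS]

IDENTITY = (1.0, 0.0, 0.0, 0.0)

centre_distances = [angular_distance(IDENTITY, v) for v in VERTICES]
pairwise_distances = [angular_distance(a, b) for a, b in combinations(VERTICES, 2)]
```

Each vertex quaternion has the form \((\tfrac{1}{2}, \pm \tfrac{1}{2}, \pm \tfrac{1}{2}, \pm \tfrac{1}{2})\), so every pairwise inner product is \(\cos ^2(\pi /3) + \sin ^2(\pi /3)\cos \alpha = \tfrac{1}{4} - \tfrac{1}{4} = 0\) and every pairwise distance is \(\pi \), while each vertex lies at distance \(2\pi /3\) from the centre.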

Some results about weakly convex sets in \(\mathrm{SO}(3)\) follow easily from corresponding statements about convex sets in \(\mathbb R ^3\).

Proposition 3

Let \(B\) be a set in \(\mathrm{SO}(3)\).

  1. If \(B\) is a weakly convex set of radius \(r < \pi \), then the closure of \(B\) is weakly convex.

  2. If \(B\) is a convex set of diameter \(d < \pi \), then the closure of \(B\) is convex.

  3. If \(B\) is a closed or open weakly convex set, then for any point \(\mathbf{x} \not \in B\), there exists a plane through \(\mathbf{x}\) that does not intersect \(B\).

  4. If \(B\) is a closed or open weakly convex set, then \(B = \mathrm{SO}(3)\,\backslash \, \bigcup \Pi _i\), where \(\Pi _i\) runs over all planes not intersecting \(B\).

Proof

We select a plane not intersecting \(B\) and map it to the plane at infinity. The set \(B\) is thereby mapped to a convex set in \(\mathbb R ^3\). In the case when \(B\) has radius \(r < \pi \), this mapping can be chosen so that \(B\) maps to a bounded set. The four parts of the proposition then all follow from properties of convex sets in \(\mathbb R ^n\). The corresponding properties of sets in \(\mathbb R ^n\) are not quite trivial; the reader is referred to Rockafellar (1970) for the required proofs.\(\square \)

Separation properties of convex sets by planes are important in the study of convex sets in \(\mathbb R ^n\). The basic separability property in \(\mathbb R ^n\) is that two disjoint convex open sets are separable by a plane (Rockafellar 1970, Theorem 11.3). As the following results show, similar properties hold for weakly convex sets in \(\mathrm{SO}(3)\), but this does not follow immediately from the \(\mathbb R ^n\) case. The necessary modification reflects the fact that a single plane in \(\mathrm{SO}(3)\) does not separate \(\mathrm{SO}(3)\) into two parts (but two planes do).

Proposition 4

If \(S\) and \(T\) are two disjoint open weakly convex sets in \(\mathrm{SO}(3)\), then there exists a plane \(\Pi \) that intersects neither of them.

Proof

Consider a plane disjoint from \(S\), and identify it as \(\Pi _\infty \), the plane at infinity. If \(\Pi _\infty \) is disjoint from \(T\), then it is the required plane. Otherwise, \(T\) is cut into two parts by \(\Pi _\infty \), such that \(T_1 \cup T_2 = T \,\backslash \, \Pi _\infty \), and \(T_1\) and \(T_2\) are open convex sets in \(\mathbb R ^3\). We form the set \(S^{\prime } = \bigcup L(\mathbf{x}, \mathbf{y})\) where \(L(\mathbf{x}, \mathbf{y})\) is a line segment in \(\mathbb R ^3\) joining a point \(\mathbf{x}\in S\) and a point \(\mathbf{y} \in T_1\), and \(S^{\prime }\) is the union of all such line segments. We claim that \(S^{\prime }\) is the convex hull (in \(\mathbb R ^3\)) of \(S \cup T_1\).

To see this, consider two points \(\mathbf{a}\) and \(\mathbf{b}\) in \(S^{\prime }\), where \(\mathbf{a}\) is on a line \(L(\mathbf{x}_1, \mathbf{y}_1)\) and \(\mathbf{b}\) is on a line \(L(\mathbf{x}_2, \mathbf{y}_2)\). Now, the points \(\mathbf{x}_1, \mathbf{x}_2, \mathbf{y}_1\) and \(\mathbf{y}_2\) are the vertices of a tetrahedron. (The case where the four points are coplanar is a special case which is easily treated separately.) This tetrahedron is convex, and hence contains the line segment from \(\mathbf{a}\) to \(\mathbf{b}\). Furthermore, every point in the tetrahedron lies on some line with endpoints in the line segments \(\mathbf{x}_1 \mathbf{x}_2\) and \(\mathbf{y}_1 \mathbf{y}_2\), which lie inside \(S\) and \(T_1\) respectively. Hence the whole tetrahedron, and in particular the line segment from \(\mathbf{a}\) to \(\mathbf{b}\), lies inside \(S^{\prime }\).

Now, we claim that this convex set \(S^{\prime }\) is disjoint from \(T_2\). In particular, if a point \(\mathbf{a}\in T_2\) lies on the line segment \(L(\mathbf{x}, \mathbf{y})\), with \(\mathbf{x} \in S, \mathbf{y} \in T_1\), then both \(\mathbf{a}\) and \(\mathbf{y}\) lie in \(T\), which is by assumption weakly convex. A line segment from \(\mathbf{a}\) to \(\mathbf{y}\) in \(T\) must pass through the plane at infinity \(\Pi _\infty \), since \(T_1\) and \(T_2\) are different connected components of \(T \,\backslash \, \Pi _\infty \). However, in this case, this line segment must pass through \(\mathbf{x}\), which contradicts the assumption that \(S\) and \(T\) are disjoint.

Therefore, the sets \(T_2\) and \(S^{\prime }\) are disjoint and convex in \(\mathbb R ^3\). Theorem 11.3 of Rockafellar (1970) ensures that there exists a plane \(\Pi \) separating \(S^{\prime }\) from \(T_2\). This plane is therefore disjoint from both \(S\) and \(T\), except possibly on the plane \(\Pi _\infty \). However, since both \(S\) and \(T\) are assumed open, it is not possible for the plane \(\Pi \) to intersect \(S\) or \(T\) only on the plane at infinity.

This completes the construction of the plane disjoint from \(S\) and \(T\).\(\square \)

The previous proposition allows us to show that two open weakly convex sets may be separated by two planes.

Proposition 5

If \(S\) and \(T\) are two disjoint open weakly convex sets in \(\mathrm{SO}(3)\), then there exist two planes \(\Pi _1\) and \(\Pi _2\) such that \(S\) and \(T\) lie in different components of \(\mathrm{SO}(3)\,\backslash (\Pi _1 \cup \Pi _2)\).

Proof

There is a plane \(\Pi _1\) that meets neither of \(S\) and \(T\). Map this plane to infinity. Then \(S\) and \(T\) are mapped to two open convex sets in \(\mathbb R ^3\), which are therefore separable by a plane \(\Pi _2\). These are the two required planes.\(\square \)

Another separation property of convex sets in \(\mathbb R ^n\) that carries over, slightly modified, to weakly convex sets in \(\mathrm{SO}(3)\) is the existence of supporting planes.

Proposition 6

Let \(S\) be a closed convex set in \(\mathrm{SO}(3)\), \(\mathtt{{R}}\) a point not in \(S\) and \(\mathtt{{T}}\) a closest point in \(S\) to \(\mathtt{{R}}\). Further, let \(\Pi \) be the plane through \(\mathtt{{T}}\) perpendicular to the line \(\mathtt{{R}} \mathtt{{T}}\). Then, the plane \(\Pi \) divides the open ball \(B = \mathring{B}(\mathtt{{T}}, \pi )\) into two half-balls, and \(S\) lies entirely in the closed half ball not containing \(\mathtt{{R}}\). Consequently, the interior of \(S\) lies in the open half ball not containing \(\mathtt{{R}}\).

This situation is illustrated in Fig. 4. The proposition holds in a more general context than in \(\mathrm{SO}(3)\), but we give a proof only for \(\mathrm{SO}(3)\), using the cosine rule.

Fig. 4 The supporting plane constructed in Proposition 6

Proof

If the distance \(\mathtt{{R}}\mathtt{{T}}\) is equal to \(\pi \), then the whole of the set \(S\) lies in the plane \(\Pi (\mathtt{{R}}, \pi )\) of points at distance \(\pi \) from \(\mathtt{{R}}\), and the result is trivially true. Therefore, assume that the distance \(\mathtt{{R}}\mathtt{{T}}\) is less than \(\pi \). Since \(S\) is convex, any point in \(S\) lies at distance less than \(\pi \) from \(\mathtt{{T}}\).

Via a gnomonic mapping centred at \(\mathtt{{T}}\), the ball \(B\) maps to the whole of \(\mathbb R ^3\), the set \(S\) maps to a closed bounded convex set and angles at \(\mathtt{{T}}\) are preserved. We may therefore use this gnomonic model to access familiar concepts concerning sets in \(\mathbb R ^3\).

Suppose that there is a point \(\mathtt{{X}}\) in \(S\) on the same side of \(\Pi \) as \(\mathtt{{R}}\). Since \(S\) is convex, the whole of the line segment \(\mathtt{{T}}\mathtt{{X}}\) lies in \(S\). Furthermore, it forms an angle \(\gamma < \pi /2\) with the line \(\mathtt{{T}}\mathtt{{R}}\). Let \(\mathtt{{X}}_t\) be the point on the segment \(\mathtt{{T}}\mathtt{{X}}\) at distance \(t\) from \(\mathtt{{T}}\) in the direction towards \(\mathtt{{X}}\).

Applying the cosine rule (Proposition 2) to the triangle \(\mathtt{{R}} \mathtt{{X}}_t \mathtt{{T}}\) as shown in Fig. 5, we see that

$$\begin{aligned} \cos \left(\frac{c(t)}{2}\right) = \left| \cos \left(\frac{t}{2}\right) \cos \left(\frac{b}{2}\right) + \sin \left(\frac{t}{2}\right) \sin \left(\frac{b}{2}\right) \cos \left(\gamma \right) \right| , \end{aligned}$$

where we write \(c(t)\) in recognition that the length \(c\) depends on the value of \(t\). Since \(0 \le t < \pi \) and \(0 \le b < \pi \), we see that for \(\gamma < \pi /2\) the expression inside the absolute value \(|\cdot |\) is positive, so

$$\begin{aligned} c(t) = 2 \arccos \left( \cos \left(\frac{t}{2}\right) \cos \left(\frac{b}{2}\right) + \sin \left(\frac{t}{2}\right) \sin \left(\frac{b}{2}\right) \cos \left(\gamma \right) \right) . \end{aligned}$$

Taking derivatives with respect to \(t\) at \(t = 0\), we find \(d c / d t |_{t = 0} = -\cos (\gamma )\), which is negative when \(\gamma < \pi /2\). Hence, for sufficiently small \(t > 0\) we have \(c(t) < c(0) = b\); that is, the point \(\mathtt{{X}}_t\) is closer to \(\mathtt{{R}}\) than \(\mathtt{{T}}\) is, which contradicts the assumption that \(\mathtt{{T}}\) is the closest point in \(S\) to \(\mathtt{{R}}\). The conclusion is that the open half ball containing \(\mathtt{{R}}\) contains no point of \(S\), as required.\(\square \)
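The derivative used in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates \(c(t)\) from the cosine rule and compares a forward finite difference at \(t = 0\) with \(-\cos (\gamma )\). The function names and the sample values of \(b\) and \(\gamma \) are illustrative assumptions.

```python
import math

def c(t, b, gamma):
    """Side length c(t) from the cosine rule, for 0 <= t, b < pi and gamma < pi/2."""
    u = (math.cos(t / 2) * math.cos(b / 2)
         + math.sin(t / 2) * math.sin(b / 2) * math.cos(gamma))
    # Clamp against rounding error before taking the arccosine.
    return 2.0 * math.acos(max(-1.0, min(1.0, u)))

def dc_dt_at_zero(b, gamma, h=1e-6):
    """Forward finite-difference estimate of dc/dt at t = 0."""
    return (c(h, b, gamma) - c(0.0, b, gamma)) / h

# Illustrative values (assumptions): b = d(R, T) and the angle gamma at T.
b, gamma = 1.2, 0.7
print("finite difference:", dc_dt_at_zero(b, gamma))
print("-cos(gamma):      ", -math.cos(gamma))
print("c(0.1) < b:       ", c(0.1, b, gamma) < b)
```

For these sample values the finite difference agrees with \(-\cos (\gamma )\) to within the discretisation error, and \(c(t)\) drops below \(b\) for small positive \(t\), as the proof requires.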

Fig. 5 The gnomonic model used in the proof of Proposition 6

1.2 Intersections of Weakly Convex Sets

We now consider various properties of intersections of convex and weakly convex sets in \(\mathrm{SO}(3)\) in a series of propositions. In the following discussion, we will use the language of projective geometry, speaking of lines and planes, instead of geodesics and geodesic planes. These relate to the geometric properties of \(\mathrm{SO}(3)\), considered as the projective space \(\mathbb P ^3\), in which geodesics play the role of lines in projective geometry. Note that the concept of weakly convex set is purely a property of the projective geometry of \(\mathrm{SO}(3)\), viewed as the projective space \(\mathbb P ^3\); a set \(S\) is weakly convex if any two points in \(S\) are joined by a single line segment contained in \(S\). According to Theorem 10, for any weakly convex set \(S\) there exists a plane that does not intersect \(S\).

We consider families of convex sets \(B_i\), indexed by \(i\) in some index set \(I\), finite or infinite.

Proposition 7

The intersection of a family of convex sets in \(\mathrm{SO}(3)\) is convex or empty.

Proof

If points \(x\) and \(y\) are in the intersection of a family of convex sets \(B_i\) then the shortest geodesic from \(x\) to \(y\) lies in each \(B_i\), and hence in their intersection. Thus the intersection is convex.\(\square \)

Proposition 8

Consider a family of weakly convex sets \(B_i\) in \(\mathrm{SO}(3)\). If there exists a plane \(\Pi \) disjoint from all of them, then their intersection is weakly convex or empty.

Proof

Consider two points \(x\) and \(y\) in \(\bigcap _{i\in I}\, B_i\). There exist two geodesic line segments joining \(x\) to \(y\) which together make up a complete closed geodesic. One of these line segments meets the plane \(\Pi \), and hence does not lie completely inside any of the \(B_i\). Since each \(B_i\) is weakly convex, the other line segment joining \(x\) to \(y\) must lie in \(B_i\). Since this is true for all \(i\), this line segment lies in the intersection of all the sets \(B_i\), which is therefore weakly convex.\(\square \)

Proposition 9

If \(B\) is a weakly convex set in \(\mathrm{SO}(3)\) and \(\Pi \) is a plane then \(B \cap \Pi \) is either empty or weakly convex. Further, \(B \setminus \Pi \) consists of at most two weakly convex components.

Proof

That \(B \cap \Pi \) is weakly convex unless it is empty is easily shown; we therefore turn to consider \(B \setminus \Pi \).

If \(\Pi \) does not intersect \(B\) then \(B \setminus \Pi =B\). Otherwise, according to Theorem 10 there exists a plane \(\Pi ^{\prime }\) that does not intersect \(B\), and this must be different from \(\Pi \), since \(\Pi \) intersects \(B\). By a suitable homography, we may map \(\Pi ^{\prime }\) to the plane at infinity. The set \(B\) maps to a convex set in \(\mathbb R ^3\) and \(\Pi \) to a plane in \(\mathbb R ^3\). From properties of convex sets in \(\mathbb R ^3\), the plane \(\Pi \) divides \(B\) into at most two parts, each of which is convex in \(\mathbb R ^3\), and hence weakly convex as a subset of \(\mathrm{SO}(3)\). Note that this also covers the case where \(B \setminus \Pi \) is empty. \(\square \)

Proposition 10

If \(B_i, \, i \in I\) is a family of weakly convex sets in \(\mathrm{SO}(3)\), then any connected component of \(\bigcap _{i\in I} \, B_i\) is weakly convex.

Proof

We select one \(B_i\) and choose a plane \(\Pi \) that it does not intersect. Then

$$\begin{aligned} \bigcap _{j\in I}\,B_j = \bigcap _{j\in I} \, (B_j \, \backslash \, \Pi ) . \end{aligned}$$

Now, let \(\mathbf{x}\) be a point in \(\bigcap _{j\in I}\,B_j\), and for any \(j\in I\) let \(B_j^{\prime }\) be the component of \(B_j \,\backslash \, \Pi \) which contains \(\mathbf{x}\). It is weakly convex by Proposition 9. Then \(\bigcap _{j\in I} \, B_j^{\prime }\) is the component of \(\bigcap _{j\in I} \, B_j\) containing \(\mathbf{x}\). It is weakly convex by Proposition 8. Since \(\mathbf{x}\) was arbitrary, every component is weakly convex.\(\square \)

Proposition 11

If \(B_i, \, i = 1, \ldots , n \) are a finite family of weakly convex sets in \(\mathrm{SO}(3)\), then their intersection consists of at most \(\genfrac(){0.0pt}{}{n}{3}+n\) disjoint weakly convex components.

Proof

The connected components are weakly convex by Proposition 10. We simply need to estimate how many such components there are. For each \(B_i\), select a plane \(\Pi _i\) that it does not intersect. The union of planes \(\Pi _i\) is disjoint from the intersection of the sets \(B_i\).

Now, map the first plane \(\Pi _1\) to the plane at infinity via a homography. The other \(n-1\) planes divide \(\mathbb R ^3\) into convex regions \(V_j\). Generically (if no \(4\) planes meet in a point and no \(3\) planes meet in the same line) there are \(\genfrac(){0.0pt}{}{n}{3} + n\) such regions \(V_j\), but fewer in the non-generic case (Steiner 1826).

Each \(V_j\) is convex in \(\mathbb R ^3\) and hence weakly convex as a subset of \(\mathrm{SO}(3)\). Now,

$$\begin{aligned} V_j \cap \bigcap _{i=1}^n\, B_i = \bigcap _{i=1}^n\, (B_i \cap V_j) . \end{aligned}$$

However, each \(B_i \cap V_j\) is weakly convex by Proposition 8, since both \(B_i\) and \(V_j\) avoid \(\Pi _i\). Similarly, the total intersection is weakly convex, since each \(B_i \cap V_j\) avoids any and all of the planes \(\Pi _i\).

Thus, there is at most one weakly convex component of \(\bigcap _{i=1}^n\, B_i\) contained in each \(V_j\), and hence there are not more than \(\genfrac(){0.0pt}{}{n}{3} + n\) components in total.\(\square \)
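The count used in this proof can be cross-checked against the standard formula for the maximum number of regions cut out of \(\mathbb R ^3\) by \(m\) planes in general position, namely \(\binom{m}{0}+\binom{m}{1}+\binom{m}{2}+\binom{m}{3}\). The sketch below (function names are illustrative assumptions) confirms that, with \(m = n-1\), this agrees with the bound \(\binom{n}{3} + n\) of Proposition 11.

```python
from math import comb

def max_regions_by_planes(m):
    """Maximum number of regions into which m planes in general position divide R^3."""
    return comb(m, 0) + comb(m, 1) + comb(m, 2) + comb(m, 3)

def component_bound(n):
    """The bound C(n, 3) + n on the number of components in Proposition 11."""
    return comb(n, 3) + n

# One plane is sent to infinity, so only n - 1 planes remain in R^3.
for n in range(1, 8):
    print(n, max_regions_by_planes(n - 1), component_bound(n))
```

The two columns coincide because \(\binom{n-1}{3} + \binom{n-1}{2} = \binom{n}{3}\) and \(\binom{n-1}{1} + 1 = n\).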

1.3 Convex Hulls and Convex Basins

In the light of Proposition 7 we may define the convex hull of a set \(B \subset \mathrm{SO}(3)\) to be the minimal convex set (if one exists) that contains \(B\). If \(B\) is not empty, and as long as there exists at least one convex set containing \(B\), then the intersection of all such convex sets containing \(B\) is itself convex, and is therefore the convex hull of \(B\).

Since the intersection of weakly convex sets is not generally weakly convex, we cannot define a weakly convex hull of a set of points in the same way. For example, a line segment of length less than \(2\pi \) is weakly convex, but the intersection of two line segments of length \(3\pi /2\) arranged suitably on a single line will not be connected and hence not weakly convex. This is easily pictured thinking of lines (closed geodesics) as circles. Under certain circumstances, however, there will exist a smallest weakly convex set containing a set \(B\). We therefore make the following definition.

Definition 3

Let \(S\) be a set in \(\mathrm{SO}(3)\) and \(H\) a weakly convex set containing \(S\). If \(H\) is a subset of every weakly convex set \(H^{\prime }\) that contains \(S\), then we say that the weakly convex hull of \(S\) exists, and is \(H\).

Thus, \(H\) is the minimal weakly convex set containing \(S\), if such a minimal set exists. Note that not every set has a weakly convex hull, even if it is contained in some weakly convex set. The empty set has no weakly convex hull since the empty set is not considered to be weakly convex.
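The example of two line segments of length \(3\pi /2\) on a single line can be made concrete. The sketch below is an illustration under stated assumptions: the closed geodesic is modelled as a circle of circumference \(2\pi \) parametrised by angle, the two arcs start at \(0\) and \(\pi \), and connected components are counted on a uniform sample of the circle. All names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi

def in_arc(theta, start, length):
    """True if the angle theta lies on the closed arc of the given length from `start`."""
    return (theta - start) % TWO_PI <= length

def count_components(membership):
    """Number of maximal runs of True in a cyclic list of booleans."""
    if all(membership):
        return 1
    n = len(membership)
    # A run begins wherever a True sample follows a False one (index -1 wraps around).
    return sum(1 for k in range(n) if membership[k] and not membership[k - 1])

N = 3600
angles = [TWO_PI * k / N for k in range(N)]
arc_a = [in_arc(t, 0.0, 1.5 * math.pi) for t in angles]      # segment of length 3*pi/2
arc_b = [in_arc(t, math.pi, 1.5 * math.pi) for t in angles]  # second segment, shifted by pi
both = [a and b for a, b in zip(arc_a, arc_b)]

print("components of first arc: ", count_components(arc_a))
print("components of second arc:", count_components(arc_b))
print("components of overlap:   ", count_components(both))
```

Each arc is connected, but their overlap consists of the two separate arcs \([0, \pi /2]\) and \([\pi , 3\pi /2]\), so it is not weakly convex.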

We list some simple properties of weakly convex hulls.

Proposition 12

A nonempty set \(S\) in \(\mathrm{SO}(3)\) has a weakly convex hull if and only if the intersection of all weakly convex sets \(H_i\) containing \(S\) consists of a single connected component. This component is the weakly convex hull.

The proof is immediate.

Sets with weakly convex hulls can be characterized simply in terms of connectivity. A nonempty set \(S\) may be called convex-connected if whenever \(S\) is contained in the disjoint union of two open weakly convex sets, \(S \subset H_1 \cup H_2\), then either \(S \cap H_1\) or \(S \cap H_2\) is empty. Note that this is analogous to the usual definition of a connected set; in fact every connected set is convex-connected. It may seem more appropriate to say that \(S\) is weakly convex-connected, but this seems too verbose, so we choose this terminology.

Proposition 13

A nonempty set \(S\) in \(\mathrm{SO}(3)\) has a weakly convex hull if and only if it is contained in some weakly convex set and is convex-connected.

Proof

Suppose that \(S\) is convex-connected and contained in the weakly convex set \(B\). Let \(\Pi _S\) be a plane that does not intersect \(B\) (Theorem 10) and hence does not intersect \(S\). We define \(H = \bigcap _i B_i\) where \(B_i\) runs over all weakly convex sets containing \(S\). If we can show that \(H\) is itself weakly convex, then it is the weakly convex hull of \(S\). This will be accomplished by showing that

$$\begin{aligned} H = \bigcap _i B_i = \bigcap _i B^{\prime }_i \end{aligned}$$
(27)

where \(B^{\prime }_i\) is a weakly convex subset of \(B_i\) and \(B^{\prime }_i \cap \Pi _S = \emptyset \). In this case \(H\) is weakly convex according to Proposition 8.

To this end, let \(B_i\) be such a weakly convex set containing \(S\). The plane \(\Pi _S\) divides \(B_{i}\) into at most two weakly convex sets, \(B_i \setminus \Pi _S = B_i^1 \cup B_i^2\) (Proposition 9), where \(B_{i}^{2}\) may be empty. Since \(B_{i}\setminus \Pi _{S}\) contains \(S\), the other component \(B_i^1\) will then be nonempty. Now let \(\Pi _i\) be a plane not intersecting \(B_i\). Then \(\mathrm{SO}(3)\setminus (\Pi _S \cup \Pi _i)\) is a union of two disjoint open weakly convex sets, and it contains \(S\). Therefore, \(S\) is contained in one of these two sets, since \(S\) is assumed to be convex-connected. Furthermore, since either \(B_i^2\) is empty or \(B_i^1\) and \(B_i^2\) lie in different sets, it follows that \(S \subset B_i^1\) or \(S\subset B_i^2\). In particular, we may replace \(B_i\) in (27) by \(B_i^{\prime }\), where \(B_i^{\prime }\) is the component of \(B_i \,\backslash \Pi _{S}\) containing \(S\). This completes the demonstration that \(S\) has a weakly convex hull.

Conversely, suppose that \(S\) has a weakly convex hull \(H\), which is therefore a weakly convex set containing \(S\) and is the intersection of all weakly convex sets containing \(S\). Let \(H_{1}\) and \(H_{2}\) be two disjoint weakly convex open sets with \(S \subset H_{1} \cup H_{2}\). Let \(\Pi _1\) and \(\Pi _2\) be two planes such that \(H_{1}\) and \(H_{2}\) are in different components of \(\mathrm{SO}(3)\,\backslash (\Pi _1 \cup \Pi _2)\). These planes exist according to Proposition 5. Then \(\mathrm{SO}(3)\,\backslash \, \Pi _1\) and \(\mathrm{SO}(3)\,\backslash \, \Pi _2\), are both weakly convex sets containing \(S\). It follows that \(H\) is disjoint from both \(\Pi _1\) and \(\Pi _2\). Suppose neither \(S\cap H_{1}\) nor \(S\cap H_{2}\) is empty. Then \(S\), and hence \(H\) contains points from both components of \(\mathrm{SO}(3)\,\backslash (\Pi _1 \cup \Pi _2)\), so \(H\) cannot be connected. This is a contradiction since \(H\) is weakly convex, and leads to the conclusion that \(S\) is contained completely in one of the two sets \(H_1\) or \(H_2\). Hence \(S\) is convex-connected. \(\square \)

As a simple corollary of this result, a connected set \(S\) contained in some weakly convex set \(B\) has a weakly convex hull.

Convex Basins. We now turn to the study of convex basins of sets \(S\) in \(\mathrm{SO}(3)\). These will be important in defining the domain of convexity of sums of distance functions defined on \(\mathrm{SO}(3)\), in Sect. 5.

For \(\mathbf{x} \in \mathrm{SO}(3)\), define \(\Pi (\mathbf{x})\) to be the plane consisting of all points at distance \(\pi \) from \(\mathbf{x}\).

Let \(S\) be a set in \(\mathrm{SO}(3)\). We define the set

$$\begin{aligned} S^{\natural } = \bigcap _{\mathbf{x} \in S}\, \mathring{B}(\mathbf{x}, \pi ) = \mathrm{SO}(3)\,\backslash \, \bigcup _{\mathbf{x} \in S} \Pi (\mathbf{x}) , \end{aligned}$$

which will be called the convex basin of \(S\). The following implications are easily demonstrated for a point \(\mathbf{x}\) and set \(S\) in \(\mathrm{SO}(3)\), following directly from the definition of \(S^{\natural }\).

$$\begin{aligned} \mathbf{x} \in S^{\natural } \Leftrightarrow \Pi (\mathbf{x}) \cap S&= \emptyset \Leftrightarrow S \subset \mathring{B}(\mathbf{x}, \pi ), \end{aligned}$$
(28)
$$\begin{aligned} \mathbf{x} \in S \Rightarrow \Pi (\mathbf{x}) \cap S^{\natural }&= \emptyset \Leftrightarrow S^{\natural } \subset \mathring{B}(\mathbf{x}, \pi ) . \end{aligned}$$
(29)

Note that the implication on the left in (29) is not bidirectional; for example, \(\Pi (\mathbf{y})^{\natural }=\emptyset \) for any \(\mathbf{y}\in \mathrm{SO}(3)\).

We give some properties of convex basins.

Proposition 14

If \(S\) is a weakly convex set then so is \(S^{\natural }\); in particular, \(S^{\natural }\) is connected.

Proof

Consider two points \(\mathbf{y}_0\) and \(\mathbf{y}_1\) in \(S^{\natural }\), lying on a line \(L\) and dividing \(L\) into two line segments \(L_0\) and \(L_1\). We show that one of the line segments \(L_i\) lies entirely in \(S^{\natural }\). Assume the contrary; thus there exist points \(\mathbf{x}_0 \in L_0\) and \(\mathbf{x}_1 \in L_1\) with \(\mathbf{x}_i \not \in S^{\natural }\) for \(i = 0, 1\).

Therefore, by (28) there exist points \(\mathbf{x}_i^{\prime } \in S\) such that \(\mathbf{x}_i^{\prime } \in \Pi (\mathbf{x}_i)\) or, equivalently, such that \(\mathbf{x}_i \in \Pi (\mathbf{x}_i^{\prime })\). Since \(S\) is weakly convex, there exist points \(\mathbf{x}^{\prime }_t \in S\), for \(t \in [0, 1]\) tracing out the line segment from \(\mathbf{x}^{\prime }_0\) to \(\mathbf{x}^{\prime }_1\). For each \(t\), let \(\mathbf{x}_t = L \cap \Pi (\mathbf{x}^{\prime }_t)\). Note that this intersection must be a single point, since \(\Pi (\mathbf{x}^{\prime }_t)\) does not contain the line \(L\) because \(\mathbf{y}_i \in S^{\natural }\) lies on \(L\). Also, for \(t=0\) and \(t=1\) we recover our previous points \(\mathbf{x}_0\) and \(\mathbf{x}_1\), respectively. Then \(\mathbf{x}_{t}^{\prime }\in \Pi (\mathbf{x}_{t})\cap S\) and \(\mathbf{x}_t \not \in S^{\natural }\) by (28). Furthermore, \(\mathbf{x}_t\) traces out a path from \(\mathbf{x}_0\) to \(\mathbf{x}_1\) on \(L\). This path must pass through \(\mathbf{y}_0\) or \(\mathbf{y}_1\), contradicting the assumption that \(\mathbf{y}_0, \mathbf{y}_1 \in S^{\natural }\).

On the other hand, the whole line \(L = L_0 \cup L_1\) cannot lie in \(S^{\natural }\), since if \(\mathbf{x}\) is any point in \(S\), then \(\Pi (\mathbf{x}) \cap L\) is non-empty (a plane and a line must meet). Thus some point in \(L\) is not in \(S^{\natural }\), unless \(S\) is empty.\(\square \)

Proposition 15

If \(S\) has a weakly convex hull \(H\), then \(S^{\natural } = H^{\natural }\); in particular, \(S^{\natural }\) is weakly convex.

Proof

Since \(S\subset H\), it follows easily that \(H^{\natural } \subset S^{\natural }\). Now, let \(\mathbf{x} \in S^{\natural }\), so \(S \subset \mathring{B}(\mathbf{x}, \pi )\) by (28). This is a weakly convex set containing \(S\). Since \(H\) is the minimal weakly convex set containing \(S\), it follows that \(H \subset \mathring{B}(\mathbf{x}, \pi )\), and so \(\mathbf{x} \in H^{\natural }\) (again by (28)). Hence, \(S^{\natural } \subset H^{\natural }\), and the result follows.\(\square \)

Proposition 16

If \(S\) is connected, then so is \(S^{\natural }\).

Proof

Since \(S\) is connected, it is convex-connected. If there exists some plane \(\Pi \) disjoint from \(S\), then Proposition 13 shows that \(S\) has a weakly convex hull, so by Proposition 15, \(S^{\natural }\) is weakly convex, hence connected.

On the other hand if each plane \(\Pi \) meets \(S\), consider a point \(\mathbf{x} \in \mathrm{SO}(3)\). Since \(\Pi (\mathbf{x}) \cap S \ne \emptyset \), it follows (from (28)) that \(\mathbf{x} \not \in S^{\natural }\). Thus \(S^{\natural }\) is empty, and hence connected. \(\square \)

Proposition 17

If \(S\) is an open set then \(S^{\natural }\) is closed. If \(S\) is closed, then \(S^{\natural }\) is open.

Proof

It is easily seen that if \(B\) is an open ball then \(B^{\natural }\) is a closed ball. Now if \(S\) is open, then it is the union of open balls \(B_i\). Consequently, \(S^{\natural } = \bigcap _i \, B^{\natural }_i\), which is closed.

Next, suppose \(S\) is closed and consider a convergent sequence of points \(\mathbf{x}_i\) in \(\mathrm{SO}(3)\setminus S^{\natural } = \bigcup _{\mathbf{y} \in S} \, \Pi (\mathbf{y})\). We wish to show that their limit point \(\mathbf{x}_\mathrm{lim}\) is also in \(\mathrm{SO}(3)\setminus S^{\natural }\). This would imply that \(\mathrm{SO}(3)\setminus S^{\natural }\) is closed, so \(S^{\natural }\) is open.

We choose points \(\mathbf{y}_i\) in \(S\) such that \(\mathbf{x}_i \in \Pi (\mathbf{y}_i)\). Since \(S\) is closed, hence compact, there exists a convergent subsequence of \(\mathbf{y}_i\) converging to a point \(\mathbf{y}_\mathrm{lim}\) in \(S\). Select a value \(\varepsilon > 0\). There exist points \(\mathbf{y}_i\) and \(\mathbf{x}_i\) such that \(d(\mathbf{y}_i, \mathbf{y}_\mathrm{lim}) < \varepsilon \), \(d(\mathbf{x}_i, \mathbf{x}_\mathrm{lim}) < \varepsilon \), and by definition \(d(\mathbf{y}_i, \mathbf{x}_i) = \pi \). By the triangle inequality, \(\pi -2\varepsilon < d(\mathbf{x}_\mathrm{lim}, \mathbf{y}_\mathrm{lim}) < \pi + 2\varepsilon \). Since \(\varepsilon \) is arbitrary, it follows that \(d(\mathbf{x}_\mathrm{lim}, \mathbf{y}_\mathrm{lim}) = \pi \). Since \(\mathbf{y}_\mathrm{lim} \in S\), it follows that \(\mathbf{x}_\mathrm{lim} \in \mathrm{SO}(3)\setminus S^{\natural }\). \(\square \)

The following result shows that the relationship \(S \leftrightarrow S^{\natural }\) is a dual relationship between open and closed weakly convex sets.

Proposition 18

If \(S\) is an open or closed weakly convex set then \(S^{\natural \natural } = S\).

Proof

If \(\mathbf{x} \in S\) then \(\Pi (\mathbf{x}) \cap S^{\natural } = \emptyset \), by (29). Then by (28), \(\mathbf{x} \in S^{\natural \natural }\), so \(S\) is contained in \(S^{\natural \natural }\). To show the reverse inclusion, let \(\mathbf{x}\) be a point not in \(S\). As remarked in Proposition 3, there exists a plane through \(\mathbf{x}\) that does not intersect \(S\). Let this plane be \(\Pi (\mathbf{x}^{\prime })\). Then \(\mathbf{x}^{\prime } \in S^{\natural }\) (by (28)), and so \(\Pi (\mathbf{x}^{\prime })\cap S^{\natural \natural } = \emptyset \) (by (29)). In particular \(\mathbf{x} \not \in S^{\natural \natural }\). \(\square \)

Proposition 19

If \(S\) is contained in a convex set \(H\), then \(H\) is contained in a single connected component of \(S^{\natural }\). In particular, if \(S\) is itself convex, then \(S^{\natural }\) is a weakly convex set containing \(S\).

Proof

Since the distance between two points in \(H\) is less than \(\pi \), no plane \(\Pi (\mathbf{x}), \mathbf{x} \in S\) will intersect with \(H\). Consequently, \(\bigcup _{\mathbf{x} \in S} \, \Pi (\mathbf{x})\) is disjoint from \(H\), and \(H\) lies fully inside \(S^{\natural } = \mathrm{SO}(3)\setminus \bigcup _{\mathbf{x} \in S} \, \Pi (\mathbf{x})\). Since \(H\) is connected it lies within a single connected component of this set. \(\square \)

Examples. Let \(S\) be the closed ball \(B(\mathtt{{S}}, r)\), with \(r < \pi \). Then \(S^{\natural }\) is the open ball \(\mathring{B}(\mathtt{{S}}, \pi - r)\). Similarly, if \(S\) is the open ball \(\mathring{B}(\mathtt{{S}}, r)\) with \(r \le \pi \), then \(S^{\natural }\) is the closed ball \(B(\mathtt{{S}}, \pi - r)\).

In particular when \(r = \pi /2\) and \(S = \mathring{B}(\mathtt{{S}}, \pi /2)\), then \(S^{\natural } = B(\mathtt{{S}}, \pi /2)\). This is a special case of Proposition 19.
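The closed-ball example can be checked numerically through criterion (28): a rotation \(\mathbf{y}\) lies in \(S^{\natural }\) exactly when the plane \(\Pi (\mathbf{y})\) misses \(S\). The sketch below is an illustration only, under the following assumptions: rotations are represented as unit quaternions \((w, x, y, z)\), the ball is centred at the identity, \(\mathbf{y}\) is a rotation by \(\theta \) about the \(z\)-axis, and \(\Pi (\mathbf{y})\) is sampled by composing \(\mathbf{y}\) with random half-turns. All function names are hypothetical.

```python
import math
import random

def quat_mul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def angle_from_identity(q):
    """Rotation angle in [0, pi] of the unit quaternion q (q and -q agree)."""
    return 2.0 * math.acos(min(1.0, abs(q[0])))

def random_unit_vector(rng):
    """Uniformly distributed unit vector in R^3."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 1e-12:
            return tuple(c / n for c in v)

def min_dist_plane_to_identity(theta, rng, samples=5000):
    """Sampled minimum distance from the identity to the plane Pi(y),
    where y is the rotation by angle theta about the z-axis."""
    y = (math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))
    best = math.pi
    for _ in range(samples):
        u = random_unit_vector(rng)
        x = quat_mul(y, (0.0,) + u)  # y followed by a half-turn, so d(x, y) = pi
        best = min(best, angle_from_identity(x))
    return best

def in_basin_of_closed_ball(theta, r, rng):
    """Criterion (28): y is in the basin iff Pi(y) misses the closed ball B(I, r)."""
    return min_dist_plane_to_identity(theta, rng) > r

rng = random.Random(0)
r = 1.0
for theta in (math.pi - r - 0.2, math.pi - r + 0.2):
    print(round(theta, 4), in_basin_of_closed_ball(theta, r, rng))
```

For a rotation at angle \(\theta \) from the centre, the sampled distance from the centre to \(\Pi (\mathbf{y})\) approaches \(\pi - \theta \) from above, so the test succeeds when \(\theta < \pi - r\) and fails when \(\theta \) exceeds \(\pi - r\) by more than the sampling error, consistent with \(S^{\natural } = \mathring{B}(\mathtt{{S}}, \pi - r)\).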

1.4 Convex Functions in \(\mathrm{SO}(3)\)

Convex functions can be defined as in \(\mathbb R ^n\), except that geodesic curves in \(\mathrm{SO}(3)\) take the place of straight lines joining two points in \(\mathbb R ^n\). To make this explicit, we need the following terminology, requiring geodesic curves to be parametrized to have constant speed.

A geodesic curve in \(\mathrm{SO}(3)\) is a constant speed path along a geodesic. Here, we think of speed as being defined in terms of the angle metric in \(\mathrm{SO}(3)\), but either of the other metrics \(d_\mathrm{chord}\) or \(d_\mathrm{quat}\) can be used instead, since they result in the same path length (except for scale).

Definition 4

Consider a function \(f: U \rightarrow \mathbb R \) defined on a weakly convex subset \(U\) of \(\mathrm{SO}(3)\). Let \(\mathbf{x}_0, \mathbf{x}_1 \in U\) and let \(g: [0, 1] \rightarrow U\) be a geodesic curve from \(\mathbf{x}_0\) to \(\mathbf{x}_1\) in \(U\), such that \(g(0) = \mathbf{x}_0\) and \(g(1) = \mathbf{x}_1\). The function \(f\) is called convex, if for any such \(\mathbf{x}_0, \mathbf{x}_1\) and \(g\), we have an inequality

$$\begin{aligned} f(g(\lambda )) \le (1-\lambda ) f(\mathbf{x}_0) + \lambda f(\mathbf{x}_1) \end{aligned}$$

for all \(\lambda \in [0,1]\). The function is called strictly convex if this inequality is strict for all \(\lambda \in (0, 1)\) whenever \(\mathbf{x}_0\ne \mathbf{x}_1\).

Various properties of convex functions hold true, just as with convex functions in \(\mathbb R ^n\).

Proposition 20

  1. The sum of convex (or strictly convex) functions defined on a weakly convex region \(U\) is convex (respectively, strictly convex).

  2. A strictly convex function defined on a weakly convex set has at most a single local minimum, which is therefore the global minimum; for convex functions (even if they are not strictly convex), any local minimum is a global minimum and the minima form a weakly convex set on which the function is constant.

The proof is the same as for convex functions in \(\mathbb R ^{n}\).

Convexity of functions can be defined locally through computing the second derivative of their restriction along geodesic paths through a point.

Definition 5

A function \(f : \mathrm{SO}(3)\rightarrow \mathbb R \) is locally convex at a point \(\mathtt{{R}}_0 \in \mathrm{SO}(3)\) if for any constant speed geodesic path \(\gamma : [-1, 1] \rightarrow \mathrm{SO}(3)\), with \(\gamma (0) = \mathtt{{R}}_0\) the function \(f\circ \gamma (t) = f(\gamma (t))\) has non-negative second derivative at \(t = 0\). It is locally strictly convex at \(\mathtt{{R}}_{0}\) if any such \(f\circ \gamma (t)\) has positive second derivative at \(t = 0\).

The connection between local convexity and convexity is as follows.

Proposition 21

If \(f : \mathrm{SO}(3)\rightarrow \mathbb R \) is smooth and locally convex (or strictly convex) at each point in a weakly convex set \(U\), except possibly at isolated global minima of \(f\), then it is convex (respectively, strictly convex) in \(U\). If \(f : \mathrm{SO}(3)\rightarrow \mathbb R \) is smooth but not locally convex at some point then it is not convex in any non-trivial ball around that point.

Next we investigate when the function \(d(\mathtt{{S}}, \mathtt{{R}})\) defined for two rotations is a convex function of \(\mathtt{{S}}\) (for fixed \(\mathtt{{R}}\)).

Theorem 11

(Convexity of metrics) Consider the function \(f(\mathtt{{S}}) = d(\mathtt{{S}}, \mathtt{{R}})^p\) for a fixed rotation \(\mathtt{{R}}\), a metric \(d(\cdot , \cdot )\), and an exponent \(p\). As a function of \(\mathtt{{S}}\), it has the following convexity properties.

  1. \(d_{\angle }(\cdot , \mathtt{{R}})\) is convex on the set \(\mathring{B}(\mathtt{{R}}, \pi )\).

  2. \(d_\mathrm{chord}(\cdot , \mathtt{{R}})\) is not convex on any non-trivial ball around \(\mathtt{{R}}\).

  3. \(d_\mathrm{quat}(\cdot , \mathtt{{R}})\) is not convex on any non-trivial ball around \(\mathtt{{R}}\).

  4. \(d_{\angle }(\cdot , \mathtt{{R}})^2\) is strictly convex on the set \(\mathring{B}(\mathtt{{R}}, \pi )\).

  5. \(d_\mathrm{chord}(\cdot , \mathtt{{R}})^2\) is strictly convex on the set \(B(\mathtt{{R}}, \pi /2)\).

  6. \(d_\mathrm{quat}(\cdot , \mathtt{{R}})^2\) is strictly convex on the set \(\mathring{B}(\mathtt{{R}}, \pi )\).

Compare these results to the graphs in Fig. 2 in Sect. 4. From these graphs, parts 2 and 3 of the theorem are evident. It is also clear that \(d_{\angle }(\cdot , \mathtt{{R}})\) is not strictly convex anywhere. The other parts of the theorem are obtained by direct computation of second derivatives. Details of how these values are computed and a table of Hessians and gradients are found in Table 3 in the following appendix.
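The behaviour described in Theorem 11 can be illustrated along geodesics passing through \(\mathtt{{R}}\). The sketch below is not a proof, since convexity must hold along every geodesic and only radial ones are tested here. It assumes the usual relations between the metrics and the rotation angle \(\theta \in [0, \pi ]\), which are not restated in this section: \(d_{\angle } = \theta \), \(d_\mathrm{chord} = 2\sqrt{2} \sin (\theta /2)\) and \(d_\mathrm{quat} = 2 \sin (\theta /4)\). A point on a geodesic through \(\mathtt{{R}}\) is given by a signed parameter \(t\) with \(\theta = |t|\); the function names are hypothetical.

```python
import math

# Assumed relations between each metric and the rotation angle theta in [0, pi].
METRICS = {
    "angle": lambda theta: theta,
    "chord": lambda theta: 2.0 * math.sqrt(2.0) * math.sin(theta / 2.0),
    "quat": lambda theta: 2.0 * math.sin(theta / 4.0),
}

def midpoint_gap(metric, p, a, b):
    """Average of f at the endpoints minus f at the midpoint, where
    f(t) = d(t)^p along a geodesic through R and |t| is the distance from R.
    A negative value violates the convexity inequality at lambda = 1/2."""
    def f(t):
        return METRICS[metric](abs(t)) ** p
    return 0.5 * (f(a) + f(b)) - f(0.5 * (a + b))

# Endpoints on the same side of R, close to R: parts 2 and 3 of the theorem.
for name in ("chord", "quat"):
    print(name, "p=1 gap:", midpoint_gap(name, 1, 0.1, 0.5))

# Squared metrics on their stated domains: parts 4, 5 and 6.
print("angle p=2 gap:", midpoint_gap("angle", 2, -1.0, 2.0))
print("chord p=2 gap:", midpoint_gap("chord", 2, -1.5, 1.0))
print("quat  p=2 gap:", midpoint_gap("quat", 2, -3.0, 2.5))
```

The gaps for the unsquared chordal and quaternion metrics are negative even for endpoints arbitrarily close to \(\mathtt{{R}}\), because the sine is concave there, whereas the squared metrics give positive gaps on the stated balls. For \(d_{\angle }\) with both endpoints on the same side of \(\mathtt{{R}}\) the gap is zero, which reflects the absence of strict convexity.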

1.5 Two Geometric Lemmas

The following two lemmas are used in the proof of Theorem 5.

Lemma 7

(Pumping lemma) Let \(B\) be a closed convex subset of \(\mathrm{SO}(3)\). Then there exists a larger closed convex subset \(\hat{B}\) of \(\mathrm{SO}(3)\) such that all points of \(B\) lie in the interior of \(\hat{B}\). Furthermore, the intersection of all such sets \(\hat{B}\) is equal to \(B\).

Proof

If \(B\) is a closed convex set, then its diameter must be strictly less than \(\pi \). Let \(\varepsilon \) be a number such that \(\mathrm{diameter}(B) + 4\varepsilon < \pi \). Now, let \(\Gamma \) be the gnomonic map based at some point in \(B\). This takes \(B\) to a closed bounded convex set \(\Gamma (B)\) in \(\mathbb R ^3\). Let \(N_\varepsilon (\Gamma (B))\) be an \(\varepsilon \)-neighbourhood of \(\Gamma (B)\), that is, the union of closed balls of radius \(\varepsilon \) centred on points of \(\Gamma (B)\). This is a closed convex set in \(\mathbb R ^3\) containing \(\Gamma (B)\) in its interior. Let \(B^{\prime } = \Gamma ^{-1} (N_\varepsilon (\Gamma (B)))\), which is a closed weakly convex set in \(\mathrm{SO}(3)\) and is our candidate for the set \(\hat{B}\) of the lemma. To show that \(B^{\prime }\) is convex, it remains to show that the diameter of \(B^{\prime }\) is less than \(\pi \).

The gnomonic map expands distances. More exactly, elementary trigonometry shows that \(\Vert \Gamma (\mathtt{{R}}) - \Gamma (\mathtt{{S}})\Vert > \alpha = d_{\angle }(\mathtt{{R}}, \mathtt{{S}})/2\), where \(\alpha \) is the angle between \(\mathtt{{R}}\) and \(\mathtt{{S}}\) on the unit quaternion sphere. In particular, the inverse image under \(\Gamma ^{-1}\) of a closed ball of radius \(\varepsilon \) in \(\mathbb R ^3\) is a set of radius less than \(2\varepsilon \) in \(\mathrm{SO}(3)\). It follows using the triangle inequality that the diameter of \(B^{\prime }\) is no more than \(\mathrm{diameter}(B) + 4\varepsilon < \pi \).\(\square \)
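The expansion property of the gnomonic map can be checked on random samples. The sketch below is an illustration under stated assumptions: the map is based at the identity and sends a unit quaternion \((w, x, y, z)\) with \(w > 0\) to \((x, y, z)/w\), that is, a rotation by \(\theta \) about a unit axis \(\mathbf{v}\) to \(\tan (\theta /2)\,\mathbf{v}\). The sampled angle range and all function names are hypothetical.

```python
import math
import random

def random_unit_quaternion(rng, min_angle=0.2, max_angle=2.5):
    """Unit quaternion (w, x, y, z) of a rotation by an angle in
    [min_angle, max_angle] (less than pi) about a random axis."""
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in axis))
    angle = rng.uniform(min_angle, max_angle)
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def gnomonic(q):
    """Gnomonic map based at the identity: (w, x, y, z) -> (x, y, z) / w, for w > 0."""
    return tuple(c / q[0] for c in q[1:])

def sphere_angle(p, q):
    """Angle alpha between two unit quaternions on the unit quaternion sphere."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))

def euclidean(u, v):
    """Euclidean distance between two points of R^3."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(0)
pairs = []
for _ in range(1000):
    p, q = random_unit_quaternion(rng), random_unit_quaternion(rng)
    pairs.append((euclidean(gnomonic(p), gnomonic(q)), sphere_angle(p, q)))

print("all expanded:", all(e > a for e, a in pairs))
print("smallest ratio:", min(e / a for e, a in pairs))
```

Since \(d_{\angle }(\mathtt{{R}}, \mathtt{{S}})/2\) is never larger than the angle \(\alpha \) between the two quaternions, the sampled inequality \(\Vert \Gamma (\mathtt{{R}}) - \Gamma (\mathtt{{S}})\Vert > \alpha \) is the form used in the proof.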

Lemma 8

Theorem 5 is true in the special case where \(B\) is a closed convex set and the rotations \(\mathtt{{R}}_i\) lie in the interior of \(B\).

Proof

Let \(B\) be a closed convex set containing all \(\mathtt{{R}}_i\) in its interior and let \(\mathtt{{R}}\) be a point not in \(B\). We will show that \(\mathtt{{R}}\) cannot be the point that minimizes the cost \(C_f(\mathtt{{R}})\) by explicitly computing a point \(\mathtt{{R}}^{\prime }\) with lesser cost. Since \(B\) is compact, there exists a point \(\mathtt{{T}} \in B\) that minimizes the distance to \(\mathtt{{R}}\). There may be more than one such point \(\mathtt{{T}}\), but we take any one. We observe first that \(d_{\angle }(\mathtt{{R}}, \mathtt{{T}}) < \pi \), since if this is not true, then \(\mathtt{{T}}\) and hence every point in \(B\) must be at distance \(\pi \) (the maximum possible distance) from \(\mathtt{{R}}\). In this case \(B\) lies in the plane at distance \(\pi \) from \(\mathtt{{R}}\), and hence has empty interior, contrary to assumption.

Now, if we were in \(\mathbb R ^n\), we could argue that \(d_{\angle }(\mathtt{{T}}, \mathtt{{R}}_i) < d_{\angle }(\mathtt{{R}}, \mathtt{{R}}_i)\), for any point \(\mathtt{{R}}_i \in B\), but this is not true in \(\mathrm{SO}(3)\). Instead we find a point \(\mathtt{{R}}^{\prime }\) such that \(d_{\angle }(\mathtt{{R}}^{\prime }, \mathtt{{R}}_i) < d_{\angle }(\mathtt{{R}}, \mathtt{{R}}_i)\), and hence \(d_{i}(\mathtt{{R}}^{\prime })<d_{i}(\mathtt{{R}})\), which proves that \(\mathtt{{R}}\) is not the point that minimizes \(C_f\).

The point \(\mathtt{{R}}^{\prime }\) is constructed as follows. Consider the minimal geodesic from \(\mathtt{{R}}\) to \(\mathtt{{T}}\) and continue it beyond \(\mathtt{{T}}\) by the same distance to a point \(\mathtt{{R}}^{\prime }\). Thus \(d_{\angle }(\mathtt{{T}}, \mathtt{{R}}) = d_{\angle }(\mathtt{{T}}, \mathtt{{R}}^{\prime }) < \pi \). We do not claim that \(\mathtt{{R}}^{\prime } \in B\), or that \(\mathtt{{R}}^{\prime }\) minimizes the cost function. Next, consider the plane \(\Pi \) passing through \(\mathtt{{T}}\) perpendicular to the geodesic from \(\mathtt{{R}}\) to \(\mathtt{{T}}\). The configuration described here satisfies the hypotheses of Proposition 6.

Now, we consider the gnomonic projection \(\Gamma \) centred at \(\mathtt{{T}}\). Since the diameter of \(B\) is less than \(\pi \), and \(\mathtt{{T}} \in B\), the whole of \(B\) is mapped to a bounded convex set in \(\mathbb R ^3\). Similarly, the shortest geodesic from \(\mathtt{{R}}\) to \(\mathtt{{T}}\) maps to a bounded line segment in \(\mathbb R ^3\), not meeting the interior of \(\Gamma (B)\), and the plane \(\Pi \) maps to a plane in \(\mathbb R ^3\). Since the gnomonic map preserves angles at the base point, \(\Gamma (\Pi )\) is perpendicular to the line from \(\Gamma (\mathtt{{R}})\) to \(\Gamma (\mathtt{{T}})\).

According to Proposition 6, the plane \(\Gamma (\Pi )\) separates \(\mathbb R ^3\) into two half-spaces, with the interior of \(\Gamma (B)\) and \(\Gamma (\mathtt{{R}}^{\prime })\) lying in one half-space, and \(\Gamma (\mathtt{{R}})\) in the other. This is shown in Fig. 6. For a point \(\mathtt{{S}} \in \mathring{B}\), we claim that the angle \(\mathtt{{R}} \mathtt{{T}} \mathtt{{S}}\) is greater than \(\pi /2\). This is obvious for the corresponding points in \(\mathbb R ^3\) since \(\Gamma (\mathtt{{S}})\) is separated from \(\Gamma (\mathtt{{R}})\) by the plane \(\Gamma (\Pi )\) which passes through \(\Gamma (\mathtt{{T}})\). Since the gnomonic projection preserves angles at the base point, the claim is valid in \(\mathrm{SO}(3)\). Since \(\mathtt{{R}}, \mathtt{{T}}\) and \(\mathtt{{R}}^{\prime }\) lie on a single geodesic, it follows that the angle \(\mathtt{{R}}^{\prime } \mathtt{{T}} \mathtt{{S}} < \pi /2\).

Fig. 6 The supporting plane in the gnomonic picture.

To complete the proof, it is sufficient to show that \(d_{\angle }(\mathtt{{S}}, \mathtt{{R}}^{\prime }) < d_{\angle }(\mathtt{{S}}, \mathtt{{R}})\). Note that we cannot appeal to the gnomonic projection to demonstrate this claim, which would be obvious in \(\mathbb R ^3\), since the gnomonic projection does not preserve lengths. Furthermore, we do not know whether the shortest geodesics from \(\mathtt{{S}}\) to \(\mathtt{{R}}\) or \(\mathtt{{R}}^{\prime }\) cross the plane at infinity in the gnomonic projection or not.

To prove the claim, we appeal to the cosine rule (13) to compute geodesic lengths in \(\mathrm{SO}(3)\). We use notation as shown in Fig. 7, where \(c = d_{\angle }(\mathtt{{S}}, \mathtt{{R}}) \le \pi \) and \(c^{\prime } = d_{\angle }(\mathtt{{S}}, \mathtt{{R}}^{\prime }) \le \pi \). Since \(\gamma + \gamma ^{\prime } = \pi \), it follows that \(\cos (\gamma ) = -\cos (\gamma ^{\prime })\). Then applying the cosine rule, we find

$$\begin{aligned} \cos \left(\frac{c}{2}\right)&= \left|\cos \left(\frac{a}{2}\right) \cos \left(\frac{b}{2}\right) - \sin \left(\frac{a}{2}\right) \sin \left(\frac{b}{2}\right) \cos \left(\gamma ^{\prime }\right)\right| \\ \cos \left(\frac{c^{\prime }}{2}\right)&= \left|\cos \left(\frac{a}{2}\right) \cos \left(\frac{b}{2}\right) + \sin \left(\frac{a}{2}\right) \sin \left(\frac{b}{2}\right) \cos \left(\gamma ^{\prime }\right)\right| \end{aligned}$$

Now, \(0 < a < \pi \) and \(0 < b< \pi \), so \(\sin (a/2) \sin (b/2) > 0\), and \(\cos (a/2) \cos (b/2) > 0\). Furthermore \(\cos (\gamma ^{\prime }) > 0\), since \(\gamma ^{\prime } < \pi /2\). It follows easily that \(\cos (c^{\prime }/2) > \cos (c/2) \ge 0\), so \(c^{\prime } < c\) as required.\(\square \)
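The inequality can also be checked numerically. The following is a minimal sketch, not part of the original proof: it evaluates the two cosine-rule expressions on a grid of values and confirms \(c^{\prime } < c\). It assumes that \(a\) and \(b\) denote the two sides meeting at \(\mathtt{{T}}\) (as the notation of Fig. 7 suggests); the function names `half_angle_cosines` and `distances` are hypothetical, introduced only for this illustration.

```python
import math

def half_angle_cosines(a, b, gamma_prime):
    """Return (cos(c/2), cos(c'/2)) from the two cosine-rule expressions."""
    cc = math.cos(a / 2) * math.cos(b / 2)
    ss = math.sin(a / 2) * math.sin(b / 2)
    cos_half_c = abs(cc - ss * math.cos(gamma_prime))        # triangle with angle gamma
    cos_half_c_prime = abs(cc + ss * math.cos(gamma_prime))  # triangle with angle gamma'
    return cos_half_c, cos_half_c_prime

def distances(a, b, gamma_prime):
    """Return (c, c') in radians; values are clamped to guard against rounding."""
    ch, chp = half_angle_cosines(a, b, gamma_prime)
    return 2 * math.acos(min(1.0, ch)), 2 * math.acos(min(1.0, chp))

# Grid over 0 < a < pi, 0 < b < pi, 0 <= gamma' < pi/2 (assumed ranges from the proof).
checked = 0
for i in range(1, 20):
    for j in range(1, 20):
        for k in range(10):
            a = i * math.pi / 20
            b = j * math.pi / 20
            gamma_prime = k * (math.pi / 2) / 10
            c, c_prime = distances(a, b, gamma_prime)
            assert c_prime < c
            checked += 1
print("grid points checked:", checked)
```

A grid check of this kind is only a sanity test of the algebra; the proof above is what establishes the inequality for all admissible values.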

Fig. 7 Notation used in proving that \(c = d_{\angle }(\mathtt{{S}}, \mathtt{{R}}) > c^{\prime } = d_{\angle }(\mathtt{{S}}, \mathtt{{R}}^{\prime })\)

Appendix: Gradients and Hessians

Given a function \(f: \mathrm{SO}(3)\rightarrow \mathbb R \), we wish to define and calculate the gradient and Hessian of this function. These entities may be expressed in terms of the exponential map at the point of interest. Let \(\exp _{\mathtt{{R}}} : \mathbb R ^3 \rightarrow \mathrm{SO}(3)\) be the exponential map at a point \(\mathtt{{R}} \in \mathrm{SO}(3)\), defined by \(\exp _{\mathtt{{R}}} [\mathbf{v}]_\times = \mathtt{{R}} \exp [\mathbf{v}]_\times \). The gradient and Hessian of the function \(f\) at the point \(\mathtt{{R}}\) are defined as the gradient and Hessian (the matrix of second derivatives) of the function \(f \circ \exp _{\mathtt{{R}}} : \mathbb R ^3 \rightarrow \mathbb R \), evaluated at \(\mathbf{v} = \mathbf{0}\).

This definition corresponds with the notion of Riemannian gradient and Hessian, which are defined on the tangent space \(T_{\mathtt{{R}}}(\mathrm{SO}(3))\) to \(\mathrm{SO}(3)\) at the point \(\mathtt{{R}}\). In this more abstract context, the Hessian is a quadratic form defined on the tangent space. If we identify \(\mathbb R ^3\) with its standard Euclidean basis as the tangent space, this quadratic form is represented by the symmetric second derivative matrix defined here.

We have defined the concept of convexity of a function defined on \(\mathrm{SO}(3)\) in terms of the values of the function along geodesics.

Theorem 12

If the Hessian of a function \(f: \mathrm{SO}(3)\rightarrow \mathbb R \) is positive semi-definite at a point \(\mathtt{{R}}_0\in \mathrm{SO}(3)\), then \(f\) is locally convex at \(\mathtt{{R}}_0\). If the Hessian is positive definite, then the function is locally strictly convex.

Proof

Let \(\gamma : \mathbb R \rightarrow \mathrm{SO}(3)\) be a constant speed geodesic path with \(\gamma (0) = \mathtt{{R}}_0\). We may pull \(\gamma \) back to a path \(\tilde{\gamma }: \mathbb R \rightarrow \mathbb R ^3\) such that \(\gamma = \exp _{\mathtt{{R}}_0} \circ \, \tilde{\gamma }\). To show that \(f\) is locally convex, we need to show that \(f \circ \gamma (t) = f \circ \exp _{\mathtt{{R}}_0} \circ \, \tilde{\gamma } (t)\) has non-negative second derivative at \(t = 0\). However, the second derivative may be written as \(\mathbf{v}^{\top }\mathtt{{H}} \mathbf{v}\), where \(\mathtt{{H}}\) is the Hessian of \(f \circ \exp _{\mathtt{{R}}_0}\) and \(\mathbf{v} = \tilde{\gamma }^{\prime }(0)\) is the derivative of \(\tilde{\gamma }\). If the Hessian is positive definite (or semi-definite), this is positive (non-negative) as required. \(\square \)

Thus, to show that a function on \(\mathrm{SO}(3)\) is convex, it is sufficient to show that its Hessian is positive definite, except possibly at isolated local minima.

Gradient and Hessian of Distance Functions. Consider \(\mathtt{{S}} \in \mathrm{SO}(3)\) and let \(f(\mathtt{{R}}) = d^p(\mathtt{{R}}, \mathtt{{S}})\) where \(d(\cdot , \cdot )\) is some bi-invariant metric defined on \(\mathrm{SO}(3)\). By definition, \(\mathtt{{H}}_f\) is the Hessian of the function

$$\begin{aligned} \tilde{f}(\mathbf{x}) = f(\mathtt{{R}} \exp [\mathbf{x}]_\times ) = d^p(\exp [\mathbf{x}]_\times , \mathtt{{R}}^{\top }\mathtt{{S}}). \end{aligned}$$

Define \(\theta = d_{\angle }(\exp [\mathbf{x}]_\times , \mathtt{{R}}^{\top }\mathtt{{S}})\) and let \(\mathtt{{R}}^{\top }\mathtt{{S}}\) be a rotation through angle \(\theta _0\) about unit axis \(\hat{\mathbf{w}}\). Then, using the cosine rule (13) we may write

$$\begin{aligned} \cos \left(\frac{\theta }{2}\right) = \cos \left(\frac{\Vert \mathbf{x}\Vert }{2}\right) \cos \left(\frac{\theta _0}{2} \right) + \sin \left(\frac{\Vert \mathbf{x}\Vert }{2}\right) \sin \left(\frac{\theta _0}{2} \right) \frac{\left<\mathbf{x},\,\hat{\mathbf{w}} \right>}{\Vert \mathbf{x}\Vert }. \end{aligned}$$

Since we wish to take derivatives up to second order, we may replace this by its second-order approximation, yielding

$$\begin{aligned} \theta \approx 2 \arccos \left( \left(1 - \frac{\Vert \mathbf{x}\Vert ^2}{8}\right) \cos \left(\frac{\theta _0}{2}\right) + \frac{1}{2}\sin \left(\frac{\theta _0}{2}\right) \left<\mathbf{x},\,\hat{\mathbf{w}} \right>\right) . \end{aligned}$$

Now, we define \(f(\mathtt{{R}}) = d^p(\mathtt{{R}}, \mathtt{{S}}) = g(d_{\angle }(\mathtt{{R}}, \mathtt{{S}})) = g(\theta )\), for some function \(g\). The various metrics being considered can all be expressed in this way for suitable functions \(g\) (see Table 2). Taking first derivatives using the chain rule gives

$$\begin{aligned} \frac{\partial \tilde{f}}{\partial x_i} = \frac{\partial g}{\partial \theta } ~ \frac{\partial \theta }{\partial x_i} \quad \text{ or} \quad \nabla _f = g^{\prime }(\theta ) \nabla _{\theta }. \end{aligned}$$

Evaluating at the point \(\mathbf{x} = 0\) gives the gradient

$$\begin{aligned} \nabla _f = -g^{\prime }(\theta _0) \hat{\mathbf{w}}. \end{aligned}$$

In interpreting this, note that \(\mathtt{{R}} \exp [t \hat{\mathbf{w}}]_\times = \exp _{\mathtt{{R}}} [t \hat{\mathbf{w}}]_\times \) is a geodesic from \(\mathtt{{R}}\) when \(t=0\) to \(\mathtt{{S}}\) when \(t=\theta _0\), since \(\exp [\theta _0 \hat{\mathbf{w}}]_\times = \mathtt{{R}}^{\top }\mathtt{{S}}\). Thus, as a vector in the tangent space at \(\mathtt{{R}}\), the unit vector \(\hat{\mathbf{w}}\) may be viewed as the direction from \(\mathtt{{R}}\) to \(\mathtt{{S}}\). When \(g^{\prime }(\theta _0) > 0\), the gradient points directly away from \(\mathtt{{S}}\), in the direction of greatest increasing distance.
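The gradient formula can be sanity-checked by finite differences. The sketch below is an illustration under stated assumptions, not the authors' code: it represents rotations by unit quaternions, uses \(d_{\angle } = 2 \arccos |\langle \mathbf{p}, \mathbf{q} \rangle |\), and picks \(g(\theta ) = \theta ^2\) with an arbitrary \(\theta _0\) and axis. All function names (`quat_exp`, `angle_between`, `f_tilde`, `central_gradient`) are hypothetical.

```python
import math

def quat_exp(x):
    """Unit quaternion (w, x, y, z) representing the rotation exp([x]_x)."""
    n = math.sqrt(sum(c * c for c in x))
    if n < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(n / 2) / n
    return (math.cos(n / 2), s * x[0], s * x[1], s * x[2])

def angle_between(p, q):
    """Angular distance between two rotations given as unit quaternions."""
    dot = abs(sum(a * b for a, b in zip(p, q)))
    return 2 * math.acos(min(1.0, dot))

def f_tilde(x, theta0, w_hat, g):
    """g(d_angle(exp[x], R^T S)), where R^T S rotates by theta0 about w_hat."""
    q0 = quat_exp([theta0 * c for c in w_hat])
    return g(angle_between(quat_exp(x), q0))

def central_gradient(fun, h=1e-6):
    """Central-difference gradient of fun at the origin of R^3."""
    grad = []
    for i in range(3):
        plus = [0.0, 0.0, 0.0]
        minus = [0.0, 0.0, 0.0]
        plus[i], minus[i] = h, -h
        grad.append((fun(plus) - fun(minus)) / (2 * h))
    return grad

theta0 = 1.1                      # assumed sample angle, 0 < theta0 < pi
w_hat = (1 / 3, 2 / 3, 2 / 3)     # assumed sample unit axis
g = lambda t: t * t               # squared geodesic distance, chosen for illustration
g_prime = lambda t: 2 * t

numeric = central_gradient(lambda x: f_tilde(x, theta0, w_hat, g))
closed_form = [-g_prime(theta0) * c for c in w_hat]
print("numeric    :", numeric)
print("closed form:", closed_form)
```

The two printed vectors should agree to within the finite-difference error; other choices of \(g\) can be substituted by changing `g` and `g_prime` together.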

Similarly, taking second derivatives using the chain and product rules leads to

$$\begin{aligned} \frac{\partial ^2 \tilde{f}}{\partial x_i \partial x_j} = \frac{\partial ^2 g}{\partial \theta ^2} ~ \frac{\partial \theta }{\partial x_i} ~ \frac{\partial \theta }{\partial x_j} + \frac{\partial g}{\partial \theta } ~ \frac{\partial ^2 \theta }{\partial x_i\partial x_j} \end{aligned}$$

or

$$\begin{aligned} \mathtt{{H}}_f =g^{\prime \prime }(\theta _0) \nabla _{\theta } \nabla _{\theta }^{\top }+ g^{\prime }(\theta _0) \mathtt{{H}}_\theta . \end{aligned}$$

From this it is straightforward to compute the Hessian. The result is

$$\begin{aligned} \mathtt{{H}}_f = g^{\prime \prime }(\theta _0) \, \hat{\mathbf{w}}_i \hat{\mathbf{w}}_i^{\top }+ g^{\prime }(\theta _0) \frac{\cot (\theta _0/2)}{2} \, (\mathtt{{I}} - \hat{\mathbf{w}}_i \hat{\mathbf{w}}_i^{\top }). \end{aligned}$$
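For readers who want the intermediate step, the derivatives of \(\theta \) at \(\mathbf{x} = \mathbf{0}\) can be sketched as follows. Here \(u(\mathbf{x})\) denotes the argument of the \(\arccos \) in the second-order approximation, so that \(\theta = 2 \arccos u\); the symbol \(u\) is introduced only for this illustration, and \(0 < \theta _0 < \pi \) is assumed so that \(\sin (\theta _0/2) > 0\).

$$\begin{aligned} u(\mathbf{0})&= \cos \left(\frac{\theta _0}{2}\right), \quad \nabla _u = \frac{1}{2} \sin \left(\frac{\theta _0}{2}\right) \hat{\mathbf{w}}, \quad \mathtt{{H}}_u = -\frac{1}{4} \cos \left(\frac{\theta _0}{2}\right) \mathtt{{I}}, \\ \nabla _{\theta }&= -\frac{2}{\sqrt{1-u^2}} \, \nabla _u = -\hat{\mathbf{w}}, \\ \mathtt{{H}}_\theta&= -\frac{2u}{(1-u^2)^{3/2}} \, \nabla _u \nabla _u^{\top }- \frac{2}{\sqrt{1-u^2}} \, \mathtt{{H}}_u = \frac{\cot (\theta _0/2)}{2} \, (\mathtt{{I}} - \hat{\mathbf{w}} \hat{\mathbf{w}}^{\top }). \end{aligned}$$

Substituting \(\nabla _{\theta }\) and \(\mathtt{{H}}_\theta \) into the chain-rule expression for \(\mathtt{{H}}_f\) gives the stated result.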

Note that both \(\hat{\mathbf{w}}_i \hat{\mathbf{w}}_i^{\top }\) and \(\mathtt{{I}} - \hat{\mathbf{w}}_i \hat{\mathbf{w}}_i^{\top }\) can be diagonalized simultaneously to \(\mathrm{diag}(1, 0, 0)\) and \(\mathrm{diag}(0, 1, 1)\). Thus, the Hessian may be transformed orthogonally (but differently for each \(i\)) to the form

$$\begin{aligned} \mathtt{{H}}_f \approx g^{\prime \prime }(\theta _0) \mathrm{diag}(1, 0, 0) + g^{\prime }(\theta _0) \frac{\cot (\theta _0/2)}{2} \, \mathrm{diag}(0, 1, 1). \end{aligned}$$

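In particular, for \(0 < \theta _0 < \pi \), where \(\cot (\theta _0/2) > 0\), the Hessian is positive definite exactly when both \(g^{\prime }(\theta _0)\) and \(g^{\prime \prime }(\theta _0)\) are positive. We can apply this formula with different functions \(g\) to obtain the results in Table 3.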
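The closed-form Hessian can be compared against a finite-difference Hessian in the same way. This is a minimal sketch under assumptions: rotations are represented by unit quaternions, \(g(\theta ) = \theta ^2\) is chosen purely as an example, and the sample values of \(\theta _0\) and \(\hat{\mathbf{w}}\) are arbitrary. The helper names are hypothetical.

```python
import math

def quat_exp(x):
    """Unit quaternion (w, x, y, z) representing the rotation exp([x]_x)."""
    n = math.sqrt(sum(c * c for c in x))
    if n < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(n / 2) / n
    return (math.cos(n / 2), s * x[0], s * x[1], s * x[2])

def angle_between(p, q):
    """Angular distance between two rotations given as unit quaternions."""
    dot = abs(sum(a * b for a, b in zip(p, q)))
    return 2 * math.acos(min(1.0, dot))

def f_tilde(x, theta0, w_hat, g):
    """g(d_angle(exp[x], R^T S)), where R^T S rotates by theta0 about w_hat."""
    q0 = quat_exp([theta0 * c for c in w_hat])
    return g(angle_between(quat_exp(x), q0))

def central_hessian(fun, h=1e-3):
    """Central-difference Hessian of fun at the origin of R^3."""
    H = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            def shifted(si, sj):
                x = [0.0, 0.0, 0.0]
                x[i] += si * h
                x[j] += sj * h
                return fun(x)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

def closed_form_hessian(theta0, w_hat, g1, g2):
    """g''(theta0) w w^T + g'(theta0) cot(theta0/2)/2 (I - w w^T)."""
    k = g1 * math.cos(theta0 / 2) / math.sin(theta0 / 2) / 2
    return [[g2 * w_hat[i] * w_hat[j]
             + k * ((1.0 if i == j else 0.0) - w_hat[i] * w_hat[j])
             for j in range(3)] for i in range(3)]

theta0 = 1.1                      # assumed sample angle, 0 < theta0 < pi
w_hat = (1 / 3, 2 / 3, 2 / 3)     # assumed sample unit axis
g = lambda t: t * t               # example choice of g

numeric = central_hessian(lambda x: f_tilde(x, theta0, w_hat, g))
closed = closed_form_hessian(theta0, w_hat, 2 * theta0, 2.0)
max_err = max(abs(numeric[i][j] - closed[i][j])
              for i in range(3) for j in range(3))
print("max abs difference:", max_err)
```

The step size `h` trades truncation error against rounding error; the value used here is a reasonable default for a double-precision check, not a tuned choice.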

Conjugate Distance Function. Given rotations \(\mathtt{{R}}_i\) and \(\mathtt{{L}}_i\), we consider the function \(\mathtt{{S}} \mapsto d^p(\mathtt{{R}}_i \mathtt{{S}}, \mathtt{{S}} \mathtt{{L}}_i)\). We wish to compute the gradient and Hessian of this function. For simplicity, we will compute these quantities at the point \(\mathtt{{S}} = \mathtt{{I}}\), and see later that the general case is easily derived from this special case. Setting \(\mathtt{{S}} = \exp [\mathbf{x}]_\times \), the gradient and Hessian are defined as the gradient and Hessian of \(d^p (\mathtt{{R}}_i \exp [\mathbf{x}]_\times , \exp [\mathbf{x}]_\times \mathtt{{L}}_i)\) with respect to the vector \(\mathbf{x}\).

Let \(\mathbf{r}_i, \mathbf{l}_i\) and \(\mathbf{s}\) be corresponding quaternion representations, chosen to lie in the upper quaternion hemisphere. Let \(\theta _i = d_{\angle }(\mathtt{{R}}_i \mathtt{{S}}, \mathtt{{S}} \mathtt{{L}}_i)\) and define \(C = \cos (\theta _i/2)\). Then, \(C\) may be written in terms of the quaternion inner product

$$\begin{aligned} C = \left<\mathbf{r}_i \cdot \mathbf{s},\,\mathbf{s} \cdot \mathbf{l}_i \right>. \end{aligned}$$

Let the quaternion representations of \(\mathtt{{R}}_i\) and \(\mathtt{{L}}_i\) be \(\mathbf{r}_i = (r_0, \mathbf{r}_i^{\prime })\) and \(\mathbf{l}_i = (l_0, \mathbf{l}_i^{\prime })\). The quaternion representation of \(\mathtt{{S}} = \exp [\mathbf{x}]_\times \) is \((\cos (\Vert \mathbf{x}\Vert /2), \sin (\Vert \mathbf{x}\Vert /2) \mathbf{x} / \Vert \mathbf{x} \Vert )\), which, as above, we may replace by its second-order approximation \(\mathbf{s} = (1 - \Vert \mathbf{x}\Vert ^2/8, \mathbf{x}/2)\). Now, we may compute the inner product \(C = \left<\mathbf{r}_i\cdot \mathbf{s},\,\mathbf{s}\cdot \mathbf{l}_i \right>\), and differentiate with respect to \(\mathbf{x}\). The results for the gradient and Hessian of \(C\) are

$$\begin{aligned} \nabla _C = \mathbf{l}^{\prime }_i \times \mathbf{r}^{\prime }_i , \end{aligned}$$
(30)

and

$$\begin{aligned} \mathtt{{H}}_C = (\mathbf{l}^{\prime }_i {\mathbf{r}^{\prime }_i}{^{\top }} + {\mathbf{r}^{\prime }_i}{\mathbf{l}^{\prime }_i}{^{\top }})/2 - \left<\mathbf{l}^{\prime }_i,\,\mathbf{r}^{\prime }_i \right> \mathtt{{I}}. \end{aligned}$$
(31)

Note that \(\mathbf{r}_i^{\prime }\) and \(\mathbf{l}_i^{\prime }\) are vectors of length \(\sin (\theta _i^r/2)\) and \(\sin (\theta _i^l/2)\), where \(\theta _i^r\) and \(\theta _i^l\) are the respective rotation angles of \(\mathtt{{R}}_i\) and \(\mathtt{{L}}_i\). Hence, the above formulas may easily be rewritten in terms of the unit rotation axes of the rotations, with the weight \(w_i = \sin (\theta _i^r/2) \sin (\theta _i^l/2)\) appearing as a common factor. The eigenvalues of \(-\mathtt{{H}}_C\) may be easily computed, and expressed in the form \((w_i \cos (\alpha _i), w_i (\cos (\alpha _i) - 1)/2, w_i (\cos (\alpha _i) + 1)/2)\) where \(\alpha _i\) is the angle between the axes of \(\mathtt{{R}}_i\) and \(\mathtt{{L}}_i\). Hence \(-\mathtt{{H}}_C\), which is the term entering the Hessian of any cost \(g(C)\) that decreases with \(C\), has at least one negative eigenvalue, unless \(\alpha _i = 0\), when it has two positive and one zero eigenvalue.
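Formulas (30) and (31) can be checked numerically by differentiating \(C(\mathbf{x}) = \left<\mathbf{r}_i \cdot \mathbf{s},\,\mathbf{s} \cdot \mathbf{l}_i \right>\) with finite differences. The sketch below is an illustration only: it assumes the Hamilton convention for the quaternion product, uses the exact quaternion of \(\exp [\mathbf{x}]_\times \) rather than its second-order approximation, and takes two arbitrary sample rotations. All names are hypothetical.

```python
import math

def quat_exp(x):
    """Unit quaternion (w, x, y, z) representing the rotation exp([x]_x)."""
    n = math.sqrt(sum(c * c for c in x))
    if n < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(n / 2) / n
    return (math.cos(n / 2), s * x[0], s * x[1], s * x[2])

def quat_mul(p, q):
    """Hamilton product of two quaternions (assumed convention)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p0 * q1 + p1 * q0 + p2 * q3 - p3 * q2,
            p0 * q2 - p1 * q3 + p2 * q0 + p3 * q1,
            p0 * q3 + p1 * q2 - p2 * q1 + p3 * q0)

r = quat_exp((0.3, -0.5, 0.4))   # sample R_i (arbitrary)
l = quat_exp((0.2, 0.6, -0.1))   # sample L_i (arbitrary)

def C(x):
    """Inner product <r.s, s.l> with s the quaternion of exp[x]."""
    s = quat_exp(x)
    return sum(a * b for a, b in zip(quat_mul(r, s), quat_mul(s, l)))

def at(i, si, j, sj, h):
    """Evaluate C at the point si*h*e_i + sj*h*e_j."""
    x = [0.0, 0.0, 0.0]
    x[i] += si * h
    x[j] += sj * h
    return C(x)

# Finite-difference gradient and Hessian at x = 0.
hg, hh = 1e-5, 1e-3
grad_num = [(at(i, 1, i, 0, hg) - at(i, -1, i, 0, hg)) / (2 * hg) for i in range(3)]
hess_num = [[(at(i, 1, j, 1, hh) - at(i, 1, j, -1, hh)
              - at(i, -1, j, 1, hh) + at(i, -1, j, -1, hh)) / (4 * hh * hh)
             for j in range(3)] for i in range(3)]

# Closed forms: grad = l' x r',  H = (l' r'^T + r' l'^T)/2 - <l', r'> I.
rv, lv = r[1:], l[1:]
grad_closed = [lv[1] * rv[2] - lv[2] * rv[1],
               lv[2] * rv[0] - lv[0] * rv[2],
               lv[0] * rv[1] - lv[1] * rv[0]]
dot_lr = sum(a * b for a, b in zip(lv, rv))
hess_closed = [[(lv[i] * rv[j] + rv[i] * lv[j]) / 2 - (dot_lr if i == j else 0.0)
                for j in range(3)] for i in range(3)]

grad_err = max(abs(a - b) for a, b in zip(grad_num, grad_closed))
hess_err = max(abs(hess_num[i][j] - hess_closed[i][j])
               for i in range(3) for j in range(3))
print("gradient error:", grad_err, "Hessian error:", hess_err)
```

Both errors should be at the level of the finite-difference accuracy. A different quaternion product convention would change the sign of the cross product in the gradient, which is why the convention is flagged as an assumption.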

Let \(d^p(\mathtt{{R}}_i \mathtt{{S}}, \mathtt{{S}}\mathtt{{L}}_i)\) be written as \(g(C)\) for some appropriate function \(g\). For example, since \(C = \cos (\theta /2)\), we have \(d_\mathrm{quat}(\cdot , \cdot )^2 = 4 \sin ^2(\theta /4) = 2(1 - C)\) and \(d_\mathrm{chord}(\cdot , \cdot )^2 = 8 \sin ^2(\theta /2) = 8 (1 - C^2)\). The gradient and Hessian may then be expressed as in Table 4.

Table 4 Hessians and gradient of the conjugate cost function \(f(\mathtt{{S}}) = d^p(\mathtt{{R}}\mathtt{{S}}, \mathtt{{S}} \mathtt{{L}})\), evaluated at \(\mathtt{{S}} = \mathtt{{I}}\)


Hartley, R., Trumpf, J., Dai, Y. et al. Rotation Averaging. Int J Comput Vis 103, 267–305 (2013). https://doi.org/10.1007/s11263-012-0601-0
