
Distances and Means of Direct Similarities

Published in International Journal of Computer Vision

Abstract

The non-Euclidean nature of direct isometries in a Euclidean space, i.e. transformations consisting of a rotation and a translation, creates difficulties when computing distances, means and distributions over them, which have been well studied in the literature. Direct similarities, transformations consisting of a direct isometry and a positive uniform scaling, present even more of a challenge—one which we demonstrate and address here. In this article, we investigate divergences (a superset of distances without constraints on symmetry and sub-additivity) for comparing direct similarities, and means induced by them via minimizing a sum of squared divergences. We analyze several standard divergences: the Euclidean distance using the matrix representation of direct similarities, a divergence from Lie group theory, and the family of all left-invariant distances derived from Riemannian geometry. We derive their properties and those of their induced means, highlighting several shortcomings. In addition, we introduce a novel family of left-invariant divergences, called SRT divergences, which resolve several issues associated with the standard divergences. In our evaluation we empirically demonstrate the derived properties of the divergences and means, both qualitatively and quantitatively, on synthetic data. Finally, we compare the divergences in a real-world application: vote-based, scale-invariant object recognition. Our results show that the new divergences presented here, and their means, are both more effective and faster to compute for this task.


Notes

  1. Divergences are a superset of distances, defined in §2.1.1.

  2. Bias is defined in §2.1.2.

  3. Preliminary work from this article appears in (Pham et al. 2011), where contributions 1, 3, 5 and 6 are briefly reported. In addition to more detail and experiments here, we further introduce: i. closed-form Euclidean means and closed-form Lie divergences; ii. proofs that the Lie divergence and all left-invariant distances induce biased means; iii. an extension of the SRT divergence (Pham et al. 2011) to a family of SRT divergences with closed-form means.

  4. The Lie divergence in \(\mathcal {{DS}}(n)\) is first discussed in (Pham et al. 2011), where it is called the intrinsic distance.

  5. There is also a right-Lie exponential map, \(\mathrm {Exp}_{\mathbf {{X}}}(\mathbf {{Z}})=e^{\mathbf {{Z}}}\mathbf {{X}}\). However, the resulting divergence, \(d_{\mathrm {L}}\left( \mathbf {{X}},\mathbf {{Y}}\right) =\left\| \mathrm {ln}(\mathbf {{Y}}\mathbf {{X}}^{-1})\right\| _{\mathrm {F}}\), would be right-invariant, not left-invariant.

  6. There is a similar divergence in the literature called the Log-Euclidean distance, defined by \(d_{\mathrm {LE}}(\mathbf {{X}},\mathbf {{Y}}):=\left\| \mathrm {ln}\mathbf {{X}}-\mathrm {ln}\mathbf {{Y}}\right\| _{\mathrm {F}}\) in the space of symmetric positive-definite matrices (Arsigny et al. 2006a). This divergence, if applied to \(\mathcal {{DS}}(n)\), becomes inverse-invariant but not left-invariant.

  7. We initialize the algorithm at one of the input direct similarities, the highest weighted one if computing a weighted mean.

  8. These formulæ are derived from the partial derivatives of the matrix exponential in \(\mathcal {{SO}}(n),\) given in Appendix Sect. “Rotation Group \(\mathcal {{SO}}(2)\)”.

  9. As an example, a great circle is totally geodesic in a sphere.

  10. That is, a line element is an infinitesimal arc-length.

  11. For example, consider \(\sigma _{s}=\sigma _{r}=\sigma _{t}=1\), \(\mathbf {{A}}:= m \left( e^{-10}\varvec{{I}}_{3},\varvec{{\hat{e}}}_{1}\right) \), \(\mathbf {{B}}:= m \left( e^{10}\varvec{{I}}_{3},\varvec{{\hat{e}}}_{1}\right) \), and \(\mathbf {{C}}:= m \left( \varvec{{I}}_{3},\mathbf {{0}}\right) \) in \(\mathcal {{DS}}(3)\). For any \(\alpha \ge 0\): \(d_{\alpha }(\mathbf {{A}},\mathbf {{B}})=20\), \(d_{\alpha }(\mathbf {{B}},\mathbf {{C}})=\sqrt{10^{2}+e^{-10(1+\alpha )}}\) and \(d_{\alpha }(\mathbf {{A}},\mathbf {{C}})=\sqrt{10^{2}+e^{10(1+\alpha )}}\), proving that \(d_{\alpha }(\mathbf {{A}},\mathbf {{C}})>d_{\alpha }(\mathbf {{A}},\mathbf {{B}})+d_{\alpha }(\mathbf {{B}},\mathbf {{C}})\).

  12. We set \(\hat{\mathbf {q}}\) to one of the input quaternions (that with the highest weight, if computing a weighted mean), or in the case of mean shift, the mean from the previous iteration of mean shift. The evaluation of Eq. (62) could be repeated, replacing \(\hat{\mathbf {q}}\) with \(\bar{\mathbf {{q}}}\) each iteration, until convergence, which is rapid as \(\hat{\mathbf {q}}\) affects only the sign applied to each input quaternion, but in practice we used just one iteration.

  13. Given 40 training sets, only 41 different registration scores are achievable.

References

  • Agrawal, M. (2006). A Lie algebraic approach for consistent pose registration for general Euclidean motion. In: Proceedings of the international conference on intelligent robots and systems (pp. 1891–1897).

  • Arnaudon, M., & Miclo, L. (2014). Means in complete manifolds: Uniqueness and approximation. ESAIM: Probability and Statistics, 18, 185–206.


  • Arnold, V., Vogtmann, K., & Weinstein, A. (1989). Mathematical methods of classical mechanics. Graduate Texts in Mathematics. Springer.

  • Arsigny, V., Commowick, O., Pennec, X., & Ayache, N. (2006a). A Log-Euclidean polyaffine framework for locally rigid or affine registration. In: Biomedical image registration (Vol. 4057, pp 120–127).

  • Arsigny, V., Pennec, X., & Ayache, N. (2006b). Bi-invariant means in Lie groups. Applications to left-invariant polyaffine transformations. Tech. Rep. RR-5885, INRIA.

  • Begelfor, E., & Werman, M. (2006). Affine invariance revisited. In: Proceedings of the IEEE conference on computer vision and pattern recognition (Vol. 2, pp. 2087–2094), Washington, DC, USA: IEEE Computer Society.

  • Beltrami, E. (1868). Teoria fondamentale degli spazi di curvatura costante. Annali di Matematica, II(2), 232–255.

  • Bhattacharya, R., & Patrangenaru, V. (2003). Large sample theory of intrinsic and extrinsic sample means on manifolds. The Annals of Statistics, 31(1), 1–29.


  • Bossa, M. N., & Olmos, S. (2006). Statistical model of similarity transformations: Building a multi-object pose model of brain structures. In: Workshop on mathematical methods in biomedical image analysis.

  • Carreira-Perpiñán, M. (2007). Gaussian mean-shift is an EM algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(5), 767–776.


  • Cetingul, H., & Vidal, R. (2009). Intrinsic mean shift for clustering on Stiefel and Grassmann manifolds. In: Proceedings of IEEE conference on computer vision and pattern recognition (pp. 1896–1902).

  • Cheng, S. H., Higham, N. J., Kenney, C. S., & Laub, A. J. (2000). Approximating the logarithm of a matrix to specified accuracy. SIAM Journal on Matrix Analysis and Applications, 22, 1112–1125.


  • Cheng, Y. (1995). Mean shift, mode seeking, and clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17, 790–799.


  • Coxeter, H. S. M. (1961). Introduction to geometry. New York: Wiley.


  • Downs, T. (1972). Orientation statistics. Biometrika, 59, 665–676.


  • Drost, B., Ulrich, M., Navab, N., & Ilic, S. (2010). Model globally, match locally: Efficient and robust 3D object recognition. In: Proceedings of IEEE conference on computer vision and pattern recognition (pp. 998–1005).

  • Dubbelman, G., Dorst, L., & Pijls, H. (2012). Manifold statistics for essential matrices. In: Proceedings of European conference on computer vision (pp. 531–544). Lecture Notes in Computer Science. Berlin: Springer.

  • Eade, E. (2011). Lie groups for 2d and 3d transformations, http://ethaneade.com/lie.pdf, revised Dec. 2013.

  • Fréchet, M. (1948). Les éléments aléatoires de nature quelconque dans un espace distancié. Annales de l’Institut Henri Poincaré, 10, 215–310.


  • Gallier, J., & Xu, D. (2002). Computing exponentials of skew-symmetric matrices and logarithms of orthogonal matrices. International Journal of Robotics and Automation, 17(4), 10–20.


  • Hall, B. C. (2003). Lie groups, Lie algebras, and representations: An elementary introduction. Berlin: Springer.


  • Hartley, R., Trumpf, J., Dai, Y., & Li, H. (2013). Rotation averaging. International Journal of Computer Vision, 103(3), 267–305.

  • Hartley, R. I., & Zisserman, A. (2004). Multiple view geometry in computer vision (2nd ed.). Cambridge: Cambridge University Press.


  • Hinton, G. E. (2002). Training products of experts by minimizing contrastive divergence. Neural Computation, 14, 1771–1800.


  • Karcher, H. (1977). Riemannian center of mass and mollifier smoothing. Communications on Pure and Applied Mathematics, 30(5), 509–541.


  • Khoshelham, K. (2007). Extending generalized Hough transform to detect 3D objects in laser range data. Workshop on Laser Scanning, XXXVI, 206–210.

  • Knopp, J., Prasad, M., Willems, G., Timofte, R., & Van Gool, L. (2010). Hough transform and 3D SURF for robust three dimensional classification. In: Proceedings of European conference on computer vision (pp. 589–602).

  • Lee, J. (1997). Riemannian manifolds: An introduction to curvature. Graduate Texts in Mathematics. Springer.

  • Leibe, B., Leonardis, A., & Schiele, B. (2008). Robust object detection with interleaved categorization and segmentation. International Journal of Computer Vision, 77(1–3), 259–289.


  • Liu, D. C., & Nocedal, J. (1989). On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1–3), 503–528.


  • Moakher, M. (2002). Means and averaging in the group of rotations. SIAM Journal on Matrix Analysis and Applications, 24, 1–16.


  • O’Neill, B. (1983). Semi-Riemannian geometry: With applications to relativity. Vol. 103 in Pure and Applied Mathematics. Academic Press.

  • Opelt, A., Pinz, A., & Zisserman, A. (2008). Learning an alphabet of shape and appearance for multi-class object detection. International Journal of Computer Vision, 80(1).

  • Park, F. C. (1995). Distance metrics on the rigid-body motions with applications to mechanism design. Journal of Mechanical Design, 117(1), 48–54.


  • Park, F. C., & Ravani, B. (1997). Smooth invariant interpolation of rotations. ACM Transactions on Graphics, 16(3), 277–295.


  • Parzen, E. (1962). On estimation of a probability density function and mode. The Annals of Mathematical Statistics, 33(3), 1065–1076.

  • Pelletier, B. (2005). Kernel density estimation on Riemannian manifolds. Statistics & Probability Letters, 73(3), 297–304.


  • Pennec, X. (1998). Computing the mean of geometric features application to the mean rotation. Tech. Rep. RR-3371, INRIA.

  • Pennec, X. (2006). Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements. Journal of Mathematical Imaging and Vision, 25(1), 127–154.


  • Pennec, X., & Ayache, N. (1998). Uniform distribution, distance and expectation problems for geometric features processing. Journal of Mathematical Imaging and Vision, 9, 49–67.


  • Pennec, X., & Thirion, J. P. (1997). A framework for uncertainty and validation of 3D registration methods based on points and frames. International Journal of Computer Vision, 25(3), 203–229.


  • Pham, M. T., Woodford, O. J., Perbet, F., Maki, A., Stenger, B., & Cipolla, R. (2011). A new distance for scale-invariant 3D shape recognition and registration. In: Proceedings of the international conference on computer vision.

  • Pham, M. T., Woodford, O. J., Perbet, F., Maki, A., & Stenger, B. (2012). Toshiba CAD model point clouds dataset. http://www.toshiba.eu/eu/Cambridge-Research-Laboratory/Computer-Vision-Group/Stereo-Points/

  • Poincaré, H. (1882). Théorie des groupes fuchsiens. Acta Mathematica, 1, 1–62.

  • Ravani, B., & Roth, B. (1983). Motion synthesis using kinematic mappings. Journal of Mechanical Design, 105(3), 460–467.


  • Rosenblatt, M. (1956). Remarks on some nonparametric estimates of a density function. The Annals of Mathematical Statistics, 27(3), 832–837.


  • Schönemann, P. (1966). A generalized solution of the orthogonal procrustes problem. Psychometrika, 31(1), 1–10.


  • Schramm, E., & Schreck, P. (2003). Solving geometric constraints invariant modulo the similarity group. In: International conference on computational science and its applications (pp. 356–365).

  • Shotton, J., Blake, A., & Cipolla, R. (2008). Multiscale categorical object recognition using contour fragments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(7), 1270–1281.


  • Sibson, R. (1979). Studies in the robustness of multidimensional scaling: Perturbational analysis of classical scaling. Journal of the Royal Statistical Society Series B, 41(2), 217–229.


  • Sternberg, S. (1999). Lectures on differential geometry. AMS Chelsea Publishing Series. Chelsea Publishing Company

  • Strasdat, H., Montiel, J., & Davison, A. J. (2010). Scale drift-aware large scale monocular SLAM. Robotics: Science and Systems, 2(3), 5.


  • Subbarao, R., & Meer, P. (2009). Nonlinear mean shift over Riemannian manifolds. International Journal of Computer Vision, 84(1).

  • Tombari, F., & Di Stefano, L. (2010). Object recognition in 3D scenes with occlusions and clutter by Hough voting. In: Proceedings of Pacific-Rim symposium on image and video technology (pp. 349–355).

  • Vaccaro, C. (2012). Heat kernel methods in finance: The SABR model. Quantitative Finance Papers. http://arxiv.org/ftp/arxiv/papers/1201/1201.1437.pdf

  • Woodford, O. J., Pham, M. T., Maki, A., Perbet, F., & Stenger, B. (2013). Demisting the Hough transform for 3D shape recognition and registration. International Journal of Computer Vision.

  • Zefran, M., & Kumar, V. (1998). Interpolation schemes for rigid body motions. Computer-Aided Design, 30(3), 179–189.


Download references

Acknowledgments

We are grateful to Peter Meer, Department of Electrical and Computer Engineering, Rutgers University, for his valuable feedback on the article’s analysis of Riemannian distances, and to all three reviewers for their insightful comments and suggestions, which helped us improve the article.


Correspondence to Minh-Tri Pham.


Communicated by Jean-Michel Morel.

Appendix: Proofs

1.1 Rotation Group \(\mathcal {{SO}}(n)\)

The rotation group \(\mathcal {{SO}}(n)\) is the group of \(n\)-dimensional rotation matrices:

$$\begin{aligned} \mathcal {{SO}}(n)=\left\{ \mathbf {{R}}\in GL(n):\mathbf {{R}}^{\mathrm {T}}\mathbf {{R}}=\varvec{{I}}_{n}\wedge \det (\mathbf {{R}})=1\right\} . \end{aligned}$$
(67)

A number of known facts related to \(\mathcal {{SO}}(n)\) are required in the proofs. They are summarized here.

Because a rotation preserves the Euclidean norm, the eigenvalues of a rotation matrix \(\mathbf {{R}}\) are unit complex numbers \(e^{i\theta _{k}}\), for \(\theta _{k}\in \mathbb {R}\) and \(k=1,\ldots ,n\). Since \(\mathbf {{R}}\) is a real matrix, both \(e^{i\theta _{k}}\) and \(e^{-i\theta _{k}}\) are eigenvalues of \(\mathbf {{R}}\). The Lie algebra of \(\mathcal {{SO}}(n)\), denoted by \(\mathfrak {{so}}(n)\), consists of the skew-symmetric matrices

$$\begin{aligned} \mathfrak {{so}}(n)=\left\{ \mathbf {{W}}\in GL(n):\mathbf {{W}}^{\mathrm {T}}=-\mathbf {{W}}\right\} . \end{aligned}$$
(68)

The complex eigenvalues of \(\mathbf {{W}}\) are 0 and complex conjugate pairs \(\pm i\theta _{k}\) (i.e. logarithms of the eigenvalues of \(\mathbf {{R}}\)). The matrix exponential series in (4) and its inverse in (16) send points back and forth between \(\mathcal {{SO}}(n)\) and \(\mathfrak {{so}}(n)\). However, (16) converges only when \(\left| \theta _{k}\right| <\pi \) for all \(k\).

The space of skew-symmetric matrices is isomorphic to the space of bivectors in geometric algebra, which is a vector space. If we define a basis for the space, any skew-symmetric matrix can be represented compactly as a vector. Denote by \(\varvec{{\hat{e}}}_{i}\) a single-entry unit vector having a 1 at row \(i\) and 0 elsewhere, with size matching the context. Let single-entry matrices \(\varvec{{\hat{E}}}_{i,j}\) be defined as \(\varvec{{\hat{E}}}_{i,j}:=\varvec{{\hat{e}}}_{i}\varvec{{\hat{e}}}_{j}^{\mathrm {T}}\). Consider \(n_{r}:=\frac{n(n-1)}{2}\) \(n\)-by-\(n\) matrices \((\varvec{{B}}_{k})_{k=1}^{n_{r}}\):

$$\begin{aligned} \varvec{{B}}_{k}:=(-1)^{i+j}(\varvec{{\hat{E}}}_{i,j}-\varvec{{\hat{E}}}_{j,i}), \end{aligned}$$
(69)

where variables \(k\) and \(i,j\) are related by \(k=n_{r}+2-j-(n-1-i/2)(i-1)\) with \(1\le i<j\le n\). Then, any skew-symmetric matrix \(\mathbf {{W}}\in \mathfrak {{so}}(n)\) is uniquely represented as \(\mathbf {{W}}=\mathbf {{x}}_{\underline{k}}\varvec{{B}}_{\underline{k}}\) for some \(\mathbf {{x}}\in \mathbb {R}^{n_{r}}\). Hereinafter, \(\mathbf {{W}}_{\times }:=\mathbf {{x}}\) denotes the vector representation of skew-symmetric matrix \(\mathbf {{W}}\) via this map. Conversely, given a vector \(\mathbf {{x}}\in \mathbb {R}^{n_{r}}\), \(\mathbf {{x}}^{\times }\) denotes the skew-symmetric matrix \(\mathbf {{x}}_{\underline{k}}\varvec{{B}}_{\underline{k}}\). In fact, when \(n=3\), \(\mathbf {{x}}^{\times }\) corresponds to the matrix representation of the cross product, i.e. for any \(\mathbf {{x}},\mathbf {{y}}\in \mathbb {R}^{3}\), \(\mathbf {{x}}^{\times }\mathbf {{y}}=\mathbf {{x}}\times \mathbf {{y}}\).
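To make the maps \(\mathbf {{W}}\mapsto \mathbf {{W}}_{\times }\) and \(\mathbf {{x}}\mapsto \mathbf {{x}}^{\times }\) concrete for \(n=3\), here is a minimal numpy sketch of both (the function names `hat` and `vee` are ours, not the paper's); it realizes the basis \((\varvec{{B}}_{k})\) of (69):

```python
# A sketch (ours) of the hat/vee maps between R^3 and so(3), using the
# basis B_1, B_2, B_3 defined in (69) and listed explicitly in the
# SO(3) subsection below.
import numpy as np

def hat(x):
    """Map x in R^3 to the skew-symmetric matrix x^x = x_k B_k."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

def vee(W):
    """Inverse map: recover x from a 3x3 skew-symmetric matrix W."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])

x, y = np.array([1.0, 2.0, 3.0]), np.array([-1.0, 0.5, 2.0])
assert np.allclose(hat(x) @ y, np.cross(x, y))  # x^x y = x cross y
assert np.allclose(vee(hat(x)), x)
```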

Consider the exponential map \(\mathbf {{x}}\rightarrow \mathrm {e}^{\mathbf {{x}}^{\times }}\) that maps points in \(\mathbb {R}^{n_{r}}\) to \(\mathcal {{SO}}(n)\). Since \(\frac{\partial \mathbf {{x}}^{\times }}{\partial \mathbf {{x}}_{k}}=\varvec{{B}}_{k}\), the partial derivatives of the function \(\mathrm {e}^{\mathbf {{x}}^{\times }}\) are given by:

$$\begin{aligned} \frac{\partial \mathrm {e}^{\mathbf {{x}}^{\times }}}{\partial \mathbf {{x}}_{k}}&= \mathrm {e}^{\mathbf {{x}}^{\times }}\Psi _{\mathbf {{x}}}(\varvec{{B}}_{k}), \end{aligned}$$
(70)

where \(\Psi _{\mathbf {{x}}}:\mathfrak {{so}}(n)\rightarrow \mathfrak {{so}}(n)\) satisfies (Hall 2003, theorem 3.5):

$$\begin{aligned} \Psi _{\mathbf {{x}}}(\mathbf {{V}}):=\sum _{k=0}^{\infty }\frac{\mathrm {ad}_{-\mathbf {{x}}^{\times }}^{k}(\mathbf {{V}})}{(k+1)!}, \end{aligned}$$
(71)

where \(\mathrm {ad}:\mathfrak {{so}}(n)\times \mathfrak {{so}}(n)\rightarrow \mathfrak {{so}}(n)\) is the adjoint operator of the Lie algebra, given by \(\mathrm {ad}_{\mathbf {{U}}}(\mathbf {{V}})=[\mathbf {{U}},\mathbf {{V}}]:=\mathbf {{U}}\mathbf {{V}}-\mathbf {{V}}\mathbf {{U}}\), where \([\cdot ,\cdot ]\) is called the commutator. We further denote by \(\tilde{\Psi }_{\mathbf {{x}}}\) the matrix that represents \(\Psi _{\mathbf {{x}}}(\cdot )\) in local coordinates, i.e. for all \(\mathbf {{v}}\in \mathbb {R}^{n_{r}}\),

$$\begin{aligned} \tilde{\Psi }_{\mathbf {{x}}}\mathbf {{v}}=\Psi _{\mathbf {{x}}}(\mathbf {{v}}^{\times })_{\times }. \end{aligned}$$
(72)

Deriving an efficient form of \(\Psi _{\mathbf {{x}}}\) (and, equivalently, of \(\tilde{\Psi }_{\mathbf {{x}}}\)) in the general case of \(\mathcal {{SO}}(n)\) is not straightforward, and is to be addressed elsewhere. However, in the cases of \(\mathcal {{SO}}(2)\) and \(\mathcal {{SO}}(3)\), we can derive it by analyzing the commutator, as follows.

1.1.1 Rotation Group \(\mathcal {{SO}}(2)\)

The 2D rotation group is one-dimensional. Every \(\mathbf {{W}}\in \mathfrak {{so}}(2)\) is written uniquely as \(\theta \varvec{{B}}_{1}\) for some \(\theta \in \mathbb {R}\). Thus, every rotation \(\mathbf {{R}}\in \mathcal {{SO}}(2)\) equals \(\mathrm {e}^{\theta \varvec{{B}}_{1}}\), and the commutator \([\cdot ,\cdot ]\) always vanishes. As a result, \(\Psi _{\mathbf {{x}}}(\mathbf {{V}})=\mathbf {{V}}\), leading to \(\tilde{\Psi }_{\mathbf {{x}}}=\varvec{{I}}_{n_{r}}\).

1.1.2 Rotation Group \(\mathcal {{SO}}(3)\)

The 3D rotation group is more involved; it has three dimensions. The eigenvalues of a rotation matrix \(\mathbf {{R}}\in \mathcal {{SO}}(3)\) are \(\{1,e^{i\theta },e^{-i\theta }\}\), where \(\mathrm {trace}\,\mathbf {{R}}=2\cos \theta +1\). As in \(\mathcal {{SO}}(2)\), \(\mathrm {ln}\mathbf {{R}}\) converges when \(\left| \theta \right| <\pi \), in which case \(\ln \mathbf {{R}}\) reduces to \(\frac{0.5\theta }{\sin \theta }\left( \mathbf {{R}}-\mathbf {{R}}^{\mathrm {T}}\right) \). If instead \(\mathbf {{W}}\in \mathfrak {{so}}(3)\) is given, the eigenvalues of \(\mathbf {{W}}\) are \(\{0,i\theta ,-i\theta \}\) where \(\theta =\left\| \mathbf {{W}}\right\| _{\mathrm {F}}/\sqrt{2}\), and \(\mathrm {e}^{\mathbf {{W}}}\) is given by Rodrigues’ formula in (31).
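Both closed forms are easy to check numerically. The following sketch (ours) implements them for \(\left| \theta \right| <\pi \), with `so3_exp` following Rodrigues' formula as in (31):

```python
# Sketch (ours): the closed forms for ln R and e^W in SO(3) quoted above,
# valid for rotation angles 0 <= |theta| < pi.
import numpy as np

def so3_log(R):
    # theta from trace(R) = 2 cos(theta) + 1
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if np.isclose(theta, 0.0):
        return np.zeros((3, 3))
    return (0.5 * theta / np.sin(theta)) * (R - R.T)

def so3_exp(W):
    theta = np.linalg.norm(W, 'fro') / np.sqrt(2.0)
    if np.isclose(theta, 0.0):
        return np.eye(3)
    # Rodrigues' formula, as in (31)
    return (np.eye(3) + (np.sin(theta) / theta) * W
            + ((1.0 - np.cos(theta)) / theta ** 2) * (W @ W))

W = np.array([[0.0, -0.3, 0.2], [0.3, 0.0, -0.1], [-0.2, 0.1, 0.0]])
assert np.allclose(so3_log(so3_exp(W)), W)
```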

According to the above-mentioned basis in \(\mathfrak {{so}}(n)\), the three basis tangent vectors in \(\mathfrak {{so}}(3)\) are \(\varvec{{B}}_{1}=\varvec{{\hat{E}}}_{3,2}-\varvec{{\hat{E}}}_{2,3}\), \(\varvec{{B}}_{2}=\varvec{{\hat{E}}}_{1,3}-\varvec{{\hat{E}}}_{3,1}\) and \(\varvec{{B}}_{3}=\varvec{{\hat{E}}}_{2,1}-\varvec{{\hat{E}}}_{1,2}\). They satisfy: \([\varvec{{B}}_{1},\varvec{{B}}_{2}]=\varvec{{B}}_{3}\), \([\varvec{{B}}_{2},\varvec{{B}}_{3}]=\varvec{{B}}_{1}\) and \([\varvec{{B}}_{3},\varvec{{B}}_{1}]=\varvec{{B}}_{2}\). Based on these equations, we derive a closed form for computing the commutator in local coordinates:

$$\begin{aligned}{}[\mathbf {{x}}^{\times },\mathbf {{y}}^{\times }]_{\times }=\mathbf {{x}}^{\times }\mathbf {{y}}= \mathbf {{x}}\times \mathbf {{y}}. \end{aligned}$$
(73)

Since \(\mathrm {ad}_{-\mathbf {{x}}^{\times }}(\mathbf {{V}})=[-\mathbf {{x}}^{\times },\mathbf {{V}}]=[\mathbf {{V}},\mathbf {{x}}^{\times }]\) for all skew-symmetric matrices \(\mathbf {{V}}\), and \((\mathbf {{x}}^{\times })^{3}=-\left\| \mathbf {{x}}\right\| ^{2}\mathbf {{x}}^{\times }\), applying (73) to (71), we obtain for all \(\mathbf {{v}}\in \mathbb {R}^{3}\):

$$\begin{aligned} \Psi _{\mathbf {{x}}}(\mathbf {{v}}^{\times })_{\times }=\mathbf {{v}}-\mathbf {{x}}^{\times }\mathbf {{v}}\frac{1-\cos \theta }{\theta ^{2}}+(\mathbf {{x}}^{\times })^{2}\mathbf {{v}}\frac{\theta -\sin \theta }{\theta ^{3}}. \end{aligned}$$
(74)

Since by definition, \(\tilde{\Psi }_{\mathbf {{x}}}\mathbf {{v}}=\Psi _{\mathbf {{x}}}(\mathbf {{v}}^{\times })_{\times }\), this leads to:

$$\begin{aligned} \tilde{\Psi }_{\mathbf {{x}}}=\varvec{{I}}_{3}-\mathbf {{x}}^{\times }\frac{1-\cos \theta }{\theta ^{2}} +(\mathbf {{x}}^{\times })^{2}\frac{\theta -\sin \theta }{\theta ^{3}}. \end{aligned}$$
(75)

We note that \(\tilde{\Psi }_{\mathbf {{x}}}^{\mathrm {T}}=\tilde{\Psi }_{-\mathbf {{x}}}\) and:

$$\begin{aligned} \tilde{\Psi }_{\mathbf {{x}}}^{\mathrm {T}}\tilde{\Psi }_{\mathbf {{x}}}=\varvec{{I}}_{3}+(\mathbf {{x}}^{\times })^{2} \frac{\theta ^{2}+2\cos \theta -2}{\theta ^{4}}. \end{aligned}$$
(76)
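The closed form (75) can be sanity-checked against the derivative identity (70): the sketch below (ours) compares \(\mathrm {e}^{\mathbf {{x}}^{\times }}\Psi _{\mathbf {{x}}}(\varvec{{B}}_{k})\) with a central finite difference of \(\mathrm {e}^{\mathbf {{x}}^{\times }}\).

```python
# Sketch (ours): verify (70) numerically in SO(3), with Psi~_x from (75).
import numpy as np
from scipy.linalg import expm

def hat(x):  # the map x -> x^x
    return np.array([[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0.0]])

def psi_tilde(x):  # closed form (75); requires 0 < |x| < pi
    theta = np.linalg.norm(x)
    X = hat(x)
    return (np.eye(3) - X * (1 - np.cos(theta)) / theta ** 2
            + X @ X * (theta - np.sin(theta)) / theta ** 3)

x, k, h = np.array([0.4, -0.2, 0.7]), 1, 1e-6
e = np.zeros(3)
e[k] = h
num = (expm(hat(x + e)) - expm(hat(x - e))) / (2 * h)  # lhs of (70)
ana = expm(hat(x)) @ hat(psi_tilde(x)[:, k])           # e^{x^x} Psi_x(B_k)
assert np.allclose(num, ana, atol=1e-5)
```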

1.2 Proofs for Lemma 1

The objective function to minimize in (6) is rewritten as:

$$\begin{aligned} \mathcal {E}(\mathbf {{X}})&= \mathsf {w}_{\underline{i}}d_{\mathrm {E}}(\varvec{{X}}_{\underline{i}},\mathbf {{X}})^{2}\nonumber \\&= \mathsf {w}_{\underline{i}}\left\| \varvec{{X}}_{\underline{i};s} \varvec{{X}}_{\underline{i};r}-\mathbf {{X}}_{s}\mathbf {{X}}_{r}\right\| _{\mathrm {F}}^{2}+ \mathsf {w}_{\underline{j}}\left\| \varvec{{X}}_{\underline{j};t}-\mathbf {{X}}_{t}\right\| ^{2}. \end{aligned}$$
(77)

Minimizing the second term with respect to \(\mathbf {{X}}_{t}\) gives (12). Hence, the problem becomes finding \(\mathbf {{X}}_{s}\) and \(\mathbf {{X}}_{r}\) that minimize the first term. To do this, we define \(\mathsf {v}_{i}:=\mathsf {w}_{i}\mathbf {{X}}_{s}^{2}\) and \(\varvec{{V}}_{i}:=\varvec{{X}}_{i;s}\varvec{{X}}_{i;r}/\mathbf {{X}}_{s}\) for all \(i=1,\ldots ,N\), and rewrite the first term as:

$$\begin{aligned} E'(\mathbf {{X}}):=\mathsf {v}_{\underline{i}}\left\| \varvec{{V}}_{\underline{i}}-\mathbf {{X}}_{r}\right\| _{\mathrm {F}}^{2}. \end{aligned}$$
(78)

The idea is to first find the \(\mathbf {{X}}_{r}\) that minimizes \(E'(\mathbf {{X}})\) given \(\mathbf {{X}}_{s}\), and then to use the resulting formula to find \(\mathbf {{X}}_{s}\). Since \(\mathbf {{X}}_{r}\) is a rotation matrix, minimizing \(E'(\mathbf {{X}})\) with respect to \(\mathbf {{X}}_{r}\) was solved in (Downs 1972; Sibson 1979). It is analogous to the classical orthogonal Procrustes problem, which seeks the orthogonal matrix that most closely transforms a given matrix into a second one. The minimizer of \(\mathsf {v}_{\underline{i}}\left\| \varvec{{V}}_{\underline{i}}-\mathbf {{X}}_{r}\right\| _{\mathrm {F}}^{2}\) is given by:

$$\begin{aligned} \mathop {\hbox {argmin}}\limits _{\mathbf {{X}}_{r}\in \mathcal {{SO}}(n)}\mathsf {v}_{\underline{i}}\left\| \varvec{{V}}_{\underline{i}}-\mathbf {{X}}_{r}\right\| _{\mathrm {F}}^{2}=\mathrm {sop}(\mathsf {v}_{\underline{j}}\varvec{{V}}_{\underline{j}}). \end{aligned}$$
(79)

Function \(\mathrm {sop}(\cdot )\) is invariant to direct dilation, i.e. \(\mathrm {sop}(\mathbf {{X}})=\mathrm {sop}(a\mathbf {{X}})\) for any \(a>0\) (Downs 1972; Sibson 1979). This gives us a formula for finding \(\overline{\mathbf {{X}}}_{r}\) independently from \(\overline{\mathbf {{X}}}_{s}\):

$$\begin{aligned} \overline{\mathbf {{X}}}_{r}&= \mathrm {sop}(\mathsf {v}_{\underline{i}}\varvec{{V}}_{\underline{i}})=\mathrm {sop}(\mathsf {w}_{\underline{i}}\mathbf {{X}}_{s}^{2} \varvec{{X}}_{\underline{i};s}\varvec{{X}}_{\underline{i};r}/\mathbf {{X}}_{s})\nonumber \\&= \mathrm {sop}(\mathsf {w}_{\underline{i}}\varvec{{X}}_{\underline{i};s} \varvec{{X}}_{\underline{i};r}), \end{aligned}$$
(80)

which is (11).

Given that we have found \(\overline{\mathbf {{X}}}_{r}\) without knowing \(\overline{\mathbf {{X}}}_{s}\), we substitute \(\overline{\mathbf {{X}}}_{r}\) back into (78), further reducing the problem to minimizing \(E'(\mathbf {{X}})\) with respect to \(\mathbf {{X}}_{s}\); this is a quadratic minimization problem, whose unique solution is given in (10).
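A compact sketch of the resulting closed-form mean follows, assuming \(\mathrm {sop}(\cdot )\) is the usual SVD-based projection onto \(\mathcal {{SO}}(n)\) (cf. Schönemann 1966; Moakher 2002); since (10) is not reproduced here, the expression for \(\overline{\mathbf {{X}}}_{s}\) below is the minimizer of the quadratic problem just described, derived by us.

```python
# Sketch (ours) of the closed-form Euclidean mean of Lemma 1.
# Inputs: weights w_i and direct similarities X_i = (s_i, R_i, t_i).
import numpy as np

def sop(M):
    """Project M onto SO(n) via the SVD (special orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0] * (M.shape[0] - 1) + [np.linalg.det(U @ Vt)])
    return U @ D @ Vt

def euclidean_mean(w, s, R, t):
    w = np.asarray(w, dtype=float) / np.sum(w)
    R_bar = sop(sum(wi * si * Ri for wi, si, Ri in zip(w, s, R)))  # (11)
    # scale: quadratic minimizer of sum_i w_i ||s_i R_i - s R_bar||_F^2
    n = R_bar.shape[0]
    s_bar = sum(wi * si * np.trace(Ri.T @ R_bar)
                for wi, si, Ri in zip(w, s, R)) / n
    t_bar = sum(wi * ti for wi, ti in zip(w, t))                   # (12)
    return s_bar, R_bar, t_bar
```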

1.3 Matrix Exponential and Logarithm in \(\mathcal {{DS}}(n)\)

As a shorthand, we write \(\xi ( a \varvec{{I}}_{n}+\mathbf {{W}})\) as \(\varvec{{E}}_{n}\). Additionally, the quotient operator is overloaded so that \(\frac{\mathbf {{A}}}{\mathbf {{B}}}:=\mathbf {{B}}^{-1}\mathbf {{A}}=\mathbf {{A}}\mathbf {{B}}^{-1}\) whenever the square matrices \(\mathbf {{A}}\) and \(\mathbf {{B}}\) commute and \(\mathbf {{B}}\) is invertible. To derive a closed form for \(\mathrm {e}^{\mathbf {{Y}}}\), we derive a closed form for \(\varvec{{E}}_{n}\), using basic facts about \(\mathcal {{SO}}(n)\) that are summarized in Appendix Sect. “Rotation Group \(\mathcal {{SO}}(n)\)”. One way to find a closed form for \(\varvec{{E}}_{n}\) is to notice that \(\mathbf {{Z}}\xi (\mathbf {{Z}})=\xi (\mathbf {{Z}})\mathbf {{Z}}=e^{\mathbf {{Z}}}-\varvec{{I}}_{n}\), and obtain

$$\begin{aligned} \varvec{{E}}_{n}=\frac{e^{a}e^{\mathbf {{W}}}-\varvec{{I}}_{n}}{ a \varvec{{I}}_{n}+\mathbf {{W}}} \end{aligned}$$
(81)

if \( a \varvec{{I}}_{n}+\mathbf {{W}}\) is invertible. Since the eigenvalues of \(\mathbf {{W}}\) are 0 and complex conjugate pairs \(\pm i\theta _{k}\) (see Appendix Sect. “Rotation Group \(\mathcal {{SO}}(n)\)”), this is the case whenever \(a\ne 0\).

Conversely, since \((\varvec{{I}}_{n}-\mathbf {{Z}})\eta (\mathbf {{Z}})=\eta (\mathbf {{Z}})(\varvec{{I}}_{n}-\mathbf {{Z}})=-\ln \mathbf {{Z}}\),

$$\begin{aligned} \eta (s\mathbf {{R}})=\frac{(\ln s)\varvec{{I}}_{n}+\ln \mathbf {{R}}}{s\mathbf {{R}}-\varvec{{I}}_{n}} \end{aligned}$$
(82)

if \(s\mathbf {{R}}-\varvec{{I}}_{n}\) is invertible. Similarly, since the eigenvalues of \(\mathbf {{R}}\) are 1 and \(e^{\pm i\theta _{k}}\) (see Appendix Sect. “Rotation Group \(\mathcal {{SO}}(n)\)”), the eigenvalues of \(s\mathbf {{R}}-\varvec{{I}}_{n}\) are \(s-1\) and \(se^{\pm i\theta _{k}}-1\). Hence, \(s\mathbf {{R}}-\varvec{{I}}_{n}\) is invertible unless \(s=1\). We write \(\eta (s\mathbf {{R}})\) as \(\varvec{{L}}_{n}\) hereinafter.

In theory, we could use Eqs. (81) and (82) to compute the matrix exponential and logarithm. However, this approach involves a matrix inversion, which is costly to compute, and it only works when the matrix in the denominator is invertible. More importantly, the forms become numerically unstable when one of the eigenvalues of \(s\mathbf {{R}}-\varvec{{I}}_{n}\) is close to zero.

In what follows, we further simplify (81) and (82) to forms that do not involve matrix inversion when \(n=2,3\), leading to closed forms for \(e^{\mathbf {{Y}}}\) and \(\mathrm {ln}\mathbf {{X}}\) in \(\mathcal {{DS}}(2)\) and \(\mathcal {{DS}}(3)\). It is possible to generalize the work of Gallier and Xu (2002) to find an efficient method for computing \(e^{\mathbf {{Y}}}\) and \(\mathrm {ln}\mathbf {{X}}\) in \(\mathcal {{DS}}(n)\) with \(n\ge 4\), but the task is much more difficult, requiring the solution of an inverse problem for each computation, and is therefore beyond the scope of this paper.

Consider a real normal \(d\)-by-\(d\) matrix \(\mathbf {{Z}}\) for some integer \(d\); such a matrix is unitarily diagonalizable. There is an efficient approach to computing \(\mathbf {{Z}}^{k}\). Using an eigenvalue decomposition, we diagonalize \(\mathbf {{Z}}=\mathbf {Q}\mathrm {diag}(\mathsf {w}_{1},\ldots ,\mathsf {w}_{d})\mathbf {Q}^{\mathrm {H}},\) where \((\mathsf {w}_{i})_{i=1}^{d}\) are complex eigenvalues, \(\mathbf {Q}^{\mathrm {H}}\) is the conjugate transpose of \(\mathbf {Q}\), and \(\mathbf {Q}\) is a unitary matrix, with each column \(\mathbf {Q}_{:,i}\) being an eigenvector corresponding to the eigenvalue \(\mathsf {w}_{i}\). Then \(\mathbf {{Z}}^{k}=\mathbf {Q}\mathrm {diag}(\mathsf {w}_{1}^{k},\ldots ,\mathsf {w}_{d}^{k})\mathbf {Q}^{\mathrm {H}}\).

The approach can be generalized to computing a matrix series. Let \(f(z)=\sum _{k=0}^{\infty }\mathsf {a}_{k}z^{k}\) be a series over a complex number \(z\). Let \(\mathbf {F}(\mathbf {{Z}})=\sum _{k=0}^{\infty }\mathsf {a}_{k}\mathbf {Z}^{k}\) be its matrix version. If \(\mathbf {{Z}}\) is diagonalizable then \(\mathbf {F}(\mathbf {{Z}})=\mathbf {Q}\mathrm {diag}(f(\mathsf {w}_{1}),\ldots , f(\mathsf {w}_{d}))\mathbf {Q}^{\mathrm {H}}\).

We notice that the complex version of \(\xi (\cdot )\) in (19) leads to a closed form,

$$\begin{aligned} \xi (z)=\sum _{k=0}^{\infty }\frac{z^{k}}{(k+1)!} =\frac{e^{z}-1}{z}, \end{aligned}$$
(83)

which leads to obtaining \(\varvec{{E}}_{n}=\xi ( a \varvec{{I}}_{n}+\mathbf {{W}})\) via diagonalizing \( a \varvec{{I}}_{n}+\mathbf {{W}}\). Similarly, the complex version of \(\eta (\cdot )\) in (21) leads to another closed form,

$$\begin{aligned} \eta (z)=\sum _{k=0}^{\infty }\frac{\left( 1-z\right) ^{k}}{k+1} =\frac{\ln z}{z-1}, \end{aligned}$$
(84)

also suggesting that we obtain \(\varvec{{L}}_{n}=\eta (s\mathbf {{R}})\) (when the series converges) via diagonalizing \(s\mathbf {{R}}\).
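As a sketch (ours) of this recipe, the snippet below evaluates \(\mathbf {F}(\mathbf {{Z}})=\mathbf {Q}\mathrm {diag}(f(\mathsf {w}_{1}),\ldots ,f(\mathsf {w}_{d}))\mathbf {Q}^{\mathrm {H}}\) with numpy and checks the identity \(\xi (\mathbf {{Z}})\mathbf {{Z}}=e^{\mathbf {{Z}}}-\varvec{{I}}_{n}\) on a matrix of the form \( a \varvec{{I}}_{2}+\mathbf {{W}}\):

```python
# Sketch (ours): evaluate a matrix series entrywise on the eigenvalues.
import numpy as np
from scipy.linalg import expm

def apply_series(f, Z):
    w, Q = np.linalg.eig(Z)                  # Z = Q diag(w) Q^{-1}
    return Q @ np.diag(f(w)) @ np.linalg.inv(Q)

xi = lambda z: (np.exp(z) - 1) / z           # (83); valid for z != 0

A = 0.3 * np.eye(2) + np.array([[0.0, -0.8], [0.8, 0.0]])  # a I + W
assert np.allclose(apply_series(xi, A) @ A, expm(A) - np.eye(2))
```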

We derive closed forms for \(e^{\mathbf {{Y}}}\) and \(\mathrm {ln}\mathbf {{X}}\) in \(\mathcal {{DS}}(2)\) using this approach. The same idea, with some extra work, yields closed forms for \(e^{\mathbf {{Y}}}\) and \(\mathrm {ln}\mathbf {{X}}\) in \(\mathcal {{DS}}(3)\). The convergence condition for \(\mathrm {ln}\mathbf {{X}}\) turns out to be the same as that for \(\mathrm {ln}\mathbf {{R}}\) (with \(\mathbf {{R}}\in \mathcal {{SO}}(n)\) and \(n=2,3\)): the rotation angle must be strictly less than 180°.

1.3.1 Matrix Exponential and Logarithm in \(\mathcal {{DS}}(2)\)

Let \(\mathbf {{Y}}={ {M} \left( a \varvec{{I}}_{2}+\mathbf {{W}},\mathbf {{u}}\right) }\in \mathfrak {{ds}}(2)\) be the matrix whose matrix exponential is to be computed. Since \(\mathbf {{W}}\) is a 2-by-2 skew-symmetric matrix, \(\mathbf {{W}}=\left[ \begin{array}{cc} 0 &{} -\theta \\ \theta &{} 0 \end{array}\right] \) for some \(\theta \in \mathbb {R}\). We rewrite \(\mathbf {{W}}\) as

$$\begin{aligned} \mathbf {{W}}=\mathbf {Q}\mathbf {\mathrm {diag}}(i\theta ,-i\theta ) \mathbf {Q}^{\mathrm {H}}, \end{aligned}$$
(85)

where \(\mathbf {Q}=(\mathbf {v},\bar{\mathbf {v}})\), \(\mathbf {v}=\frac{1}{\sqrt{2}}(1,-i)^{\mathrm {T}}\), and \(\bar{\mathbf {v}}=\frac{1}{\sqrt{2}}(1,i)^{\mathrm {T}}\) is the complex conjugate of \(\mathbf {v}\), leading to \( a \varvec{{I}}_{2}+\mathbf {{W}}=\mathbf {Q}\mathbf {\mathrm {diag}}(a+i\theta ,a-i\theta )\mathbf {Q}^{\mathrm {H}}\). Thus, substituting this equation into (19) and using (83) to simplify the series, we get:

$$\begin{aligned} \varvec{{E}}_{2}&= \mathbf {Q}\mathbf {\mathrm {diag}}(\xi (a+i\theta ),\xi (a-i\theta ))\mathbf {Q}^{\mathrm {H}}\nonumber \\&= \xi ( a +i\theta )\mathbf {v}\mathbf {v}^{\mathrm {H}}+\xi ( a -i\theta )\bar{\mathbf {v}}\bar{\mathbf {v}}^{\mathrm {H}}. \end{aligned}$$
(86)

Directly calculating the real part and the imaginary part of \(\xi ( a +i\theta )=\frac{e^{ a +i\theta }-1}{ a +i\theta }\) gives \(\xi ( a +i\theta )=:\xi _{\mathfrak {r}}+i\xi _{\mathrm {\mathfrak {i}}}\), where \(\xi _{\mathfrak {r}}\) and \(\xi _{\mathrm {\mathfrak {i}}}\) are given in (24) and (25) respectively. Since \(\xi ( a +i\theta )\) is the complex conjugate of \(\xi ( a -i\theta )\), it follows that

$$\begin{aligned} \varvec{{E}}_{2}=\xi _{\mathfrak {r}}\left( \mathbf {v}\mathbf {v}^{\mathrm {H}}+\bar{\mathbf {v}} \bar{\mathbf {v}}^{\mathrm {H}}\right) +i\xi _{\mathrm {\mathfrak {i}}}\left( \mathbf {v}\mathbf {v} ^{\mathrm {H}}-\bar{\mathbf {v}}\bar{\mathbf {v}}^{\mathrm {H}}\right) . \end{aligned}$$
(87)

By the definition of \(\mathbf {v}\) and \(\bar{\mathbf {v}}\), \(\mathbf {v}\mathbf {v}^{\mathrm {H}}+\bar{\mathbf {v}}\bar{\mathbf {v}}^ {\mathrm {H}}=\varvec{{\hat{E}}}_{1,1}+\varvec{{\hat{E}}}_{2,2}\) and \(\mathbf {v}\mathbf {v}^{\mathrm {H}}-\bar{\mathbf {v}}\bar{\mathbf {v}}^ {\mathrm {H}}=i(\varvec{{\hat{E}}}_{1,2}-\varvec{{\hat{E}}}_{2,1})\) (\(\varvec{{\hat{E}}}_{i,j}\) are defined in Appendix Sect. “Rotation Group \(\mathcal {{SO}}(n)\)”). This gives a closed form for \(\mathrm {e}^{\mathbf {{Y}}}\), as shown in (23).
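The closed form can be verified against a generic matrix exponential. In the sketch below (ours), \(\mathbf {{Y}}\) is laid out as the homogeneous \(3\times 3\) matrix with top-left block \( a \varvec{{I}}_{2}+\mathbf {{W}}\), top-right block \(\mathbf {{u}}\) and zero bottom row, which we assume matches the paper's \( {M} (\cdot ,\cdot )\) convention:

```python
# Sketch (ours): e^Y = m(e^a R(theta), E_2 u) in DS(2), vs scipy's expm.
import numpy as np
from scipy.linalg import expm

a, theta, u = 0.4, 0.9, np.array([1.0, -2.0])
W = theta * np.array([[0.0, -1.0], [1.0, 0.0]])

Y = np.zeros((3, 3))                       # assumed layout of M(a I + W, u)
Y[:2, :2], Y[:2, 2] = a * np.eye(2) + W, u

xi = (np.exp(a + 1j * theta) - 1) / (a + 1j * theta)      # xi_r + i xi_i
E2 = np.array([[xi.real, -xi.imag], [xi.imag, xi.real]])  # real form, cf. (23)

X = np.eye(3)
X[:2, :2] = np.exp(a) * np.array([[np.cos(theta), -np.sin(theta)],
                                  [np.sin(theta), np.cos(theta)]])
X[:2, 2] = E2 @ u
assert np.allclose(X, expm(Y))
```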

Similarly, given \(\mathbf {{X}}= m \left( s\mathbf {{R}},\mathbf {{t}}\right) \in \mathcal {{DS}}(2)\), to derive a closed form for \(\mathrm {ln}\mathbf {{X}}\), we need a closed form for \(\varvec{{L}}_{2}=\eta (s\mathbf {{R}})\). Let \(\mathbf {{W}}:=\mathrm {ln}\mathbf {{R}}\) be the principal matrix logarithm of the rotation matrix \(\mathbf {{R}}\). Suppose we have diagonalized \(\mathbf {{W}}\) as above. Taking the matrix exponential of \(\mathbf {{W}}\) via the expanded version in (85), we obtain \(\mathbf {{R}}=\mathbf {Q}\mathbf {\mathrm {diag}}(e^{i\theta },e^{-i\theta })\mathbf {Q}^{\mathrm {H}}\). This leads to a closed form for computing \(\theta \) from \(\mathbf {{R}}\): \(\theta =\arctan (\mathbf {{R}}_{2,1}/\mathbf {{R}}_{1,1})\). More importantly, as we substitute this diagonalized form of \(\mathbf {{R}}\) into (21), using (84) to simplify the series, we get:

$$\begin{aligned} \varvec{{L}}_{2}&= \mathbf {Q}\mathbf {\mathrm {diag}}(\eta (se^{i\theta }),\eta (se^{-i\theta }))\mathbf {Q}^{\mathrm {H}}\nonumber \\&= \eta (se^{i\theta })\mathbf {v}\mathbf {v}^{\mathrm {H}}+\eta (se^{-i\theta })\bar{\mathbf {v}}\bar{\mathbf {v}}^{\mathrm {H}}. \end{aligned}$$
(88)

By directly calculating the real part and the imaginary part of \(\eta (se^{i\theta })=\frac{\ln s+i\theta }{se^{i\theta }-1}=:\eta _{\mathrm {\mathfrak {r}}}+i\eta _{\mathrm {\mathfrak {i}}}\), we get closed forms for \(\eta _{\mathrm {\mathfrak {r}}}\) and \(\eta _{\mathrm {\mathfrak {i}}}\) as shown in (28) and (29) respectively. Separating the real part of \(\varvec{{L}}_{2}\) from the imaginary part of \(\varvec{{L}}_{2}\), we also have

$$\begin{aligned} \varvec{{L}}_{2}=\eta _{\mathrm {\mathfrak {r}}}(\mathbf {v}\mathbf {v}^{\mathrm {H}}+\bar{\mathbf {v}} \bar{\mathbf {v}}^{\mathrm {H}})+i\eta _{\mathrm {\mathfrak {i}}}(\mathbf {v}\mathbf {v}^{ \mathrm {H}}-\bar{\mathbf {v}}\bar{\mathbf {v}}^{\mathrm {H}}), \end{aligned}$$
(89)

using the same argument as above. Therefore, we have obtained a closed form for \(\varvec{{L}}_{2}\) as shown in (27), and then a closed form for \(\mathrm {ln}\mathbf {{X}}\) as shown in (26).
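A companion sketch (ours, same layout assumption as above) checks the closed form for \(\mathrm {ln}\mathbf {{X}}\) against scipy's generic `logm`, for \(s\ne 1\) and \(\left| \theta \right| <\pi \):

```python
# Sketch (ours): ln X = M((ln s) I + W, L_2 t) in DS(2), vs scipy's logm.
import numpy as np
from scipy.linalg import logm

s, theta, t = 1.7, -0.6, np.array([0.5, 2.0])
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
X = np.eye(3)
X[:2, :2], X[:2, 2] = s * R, t

eta = (np.log(s) + 1j * theta) / (s * np.exp(1j * theta) - 1)  # eta_r + i eta_i
L2 = np.array([[eta.real, -eta.imag], [eta.imag, eta.real]])   # cf. (27)

lnX = np.zeros((3, 3))
lnX[:2, :2] = np.log(s) * np.eye(2) + theta * np.array([[0.0, -1.0],
                                                        [1.0, 0.0]])
lnX[:2, 2] = L2 @ t
assert np.allclose(lnX, logm(X))
```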

1.3.2 Matrix Exponential and Logarithm in \(\mathcal {{DS}}(3)\)

Finding closed forms for the matrix exponential and logarithm in \(\mathcal {{DS}}(3)\) requires more work. Let \(\mathbf {{Y}}= {M} \left( a \varvec{{I}}_{3}+\mathbf {{W}},\mathbf {{u}}\right) \in \mathfrak {{ds}}(3)\). Suppose \(\mathbf {{W}}\) is decomposed into \(\mathbf {{W}}=\mathbf {Q}\mathrm {diag}(0,i\theta ,-i\theta )\mathbf {Q}^{\mathrm {H}}\) (see Appendix Sect. “Rotation Group \(\mathcal {{SO}}(n)\)”), where \(\mathbf {Q}=(\mathbf {n},\mathbf {v},\bar{\mathbf {v}})\), \(\mathbf {n}\) is the normal vector representing the axis of rotation, and \(\mathbf {v}\) and \(\bar{\mathbf {v}}\) are a pair of complex conjugate unit vectors. We can rewrite \(\mathbf {{W}}\) as:

$$\begin{aligned} \mathbf {{W}}=i\theta \mathbf {v}\mathbf {v}^{\mathrm {H}}-i\theta \bar{ \mathbf {v}}\bar{\mathbf {v}}^{\mathrm {H}}. \end{aligned}$$
(90)

Dividing both sides of (90) by \(i\theta \) and squaring the result (noticing that \(\mathbf {v}\) and \(\bar{\mathbf {v}}\) are orthonormal to each other) yields:

$$\begin{aligned} \mathbf {v}\mathbf {v}^{\mathrm {H}}-\bar{\mathbf {v}}\bar{\mathbf {v}}^{\mathrm {H}}&= \frac{\mathbf {{W}}}{i\theta },\end{aligned}$$
(91)
$$\begin{aligned} \mathbf {v}\mathbf {v}^{\mathrm {H}}+\bar{\mathbf {v}}\bar{\mathbf {v}}^{\mathrm {H}}&= \frac{\mathbf {{W}}^{2}}{-\theta ^{2}}. \end{aligned}$$
(92)

Since \((\mathbf {n},\mathbf {v},\bar{\mathbf {v}})\) form an orthonormal basis, we have:

$$\begin{aligned} \mathbf {n}\mathbf {n}^{\mathrm {T}}+\mathbf {v}\mathbf {v}^{\mathrm {H}}+\bar{\mathbf {v}} \bar{\mathbf {v}}^{\mathrm {H}}=\varvec{{I}}_{3}, \end{aligned}$$
(93)

which leads to

$$\begin{aligned} \mathbf {n}\mathbf {n}^{\mathrm {T}}=\varvec{{I}}_{3}+\frac{\mathbf {{W}}^{2}}{\theta ^{2}}, \end{aligned}$$
(94)

by substituting (92) to it.

Diagonalizing \(\varvec{{E}}_{3}=\xi ( a \varvec{{I}}_{3}+\mathbf {{W}})\), we obtain

$$\begin{aligned} \varvec{{E}}_{3}&= \xi ( a )\mathbf {n}\mathbf {n}^{\mathrm {T}}+\xi ( a +i\theta )\mathbf {v} \mathbf {v}^{\mathrm {H}}+\xi ( a -i\theta )\bar{\mathbf {v}}\bar{\mathbf {v}}^{\mathrm {H}}. \end{aligned}$$
(95)

Using the same argument as in the case of \(\mathcal {{DS}}(2)\) for deriving \(\varvec{{E}}_{2}\), it follows that

$$\begin{aligned} \varvec{{E}}_{3}=\frac{e^{ a }-1}{ a }\mathbf {n}\mathbf {n}^{\mathrm {T}}+ \xi _{\mathfrak {r}}(\mathbf {v}\mathbf {v}^{\mathrm {H}}+\bar{\mathbf {v}}\bar{ \mathbf {v}}^{\mathrm {H}})+i\xi _{\mathrm {\mathfrak {i}}}(\mathbf {v}\mathbf {v}^{\mathrm {H}}-\bar{\mathbf { v}}\bar{\mathbf {v}}^{\mathrm {H}}), \end{aligned}$$
(96)

where \(\xi _{\mathfrak {r}}\) and \(\xi _{\mathrm {\mathfrak {i}}}\) are defined in (24) and (25) respectively. Substituting (90), (91) and (92) to (96), we obtain a closed form for \(\varvec{{E}}_{3}\), i.e. (32), leading to a closed form for \(\mathrm {e}^{\mathbf {{Y}}}\), i.e. (30).
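Concretely, the substitution gives \(\varvec{{E}}_{3}=\xi ( a )(\varvec{{I}}_{3}+\mathbf {{W}}^{2}/\theta ^{2})-\xi _{\mathfrak {r}}\mathbf {{W}}^{2}/\theta ^{2}+\xi _{\mathrm {\mathfrak {i}}}\mathbf {{W}}/\theta \). The sketch below (ours, with the same homogeneous layout assumption as in the \(\mathcal {{DS}}(2)\) case) checks both blocks of \(\mathrm {e}^{\mathbf {{Y}}}\) against scipy:

```python
# Sketch (ours): the closed form for e^Y in ds(3), vs scipy's expm.
import numpy as np
from scipy.linalg import expm

a, u = 0.5, np.array([1.0, 0.0, -2.0])
W = np.array([[0.0, -0.3, 0.2], [0.3, 0.0, -0.1], [-0.2, 0.1, 0.0]])
theta = np.linalg.norm(W, 'fro') / np.sqrt(2.0)

xi = lambda z: (np.exp(z) - 1) / z
c = xi(a + 1j * theta)                      # xi_r + i xi_i
E3 = (xi(a) * (np.eye(3) + W @ W / theta ** 2)
      - c.real * (W @ W) / theta ** 2 + c.imag * W / theta)

Y = np.zeros((4, 4))
Y[:3, :3], Y[:3, 3] = a * np.eye(3) + W, u
assert np.allclose(expm(Y)[:3, :3], np.exp(a) * expm(W))  # scale-rotation
assert np.allclose(expm(Y)[:3, 3], E3 @ u)                # translation
```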

Given \(\mathbf {{X}}= m \left( s\mathbf {{R}},\mathbf {{t}}\right) \in \mathcal {{DS}}(3)\), finding a closed form for \(\ln \mathbf {{X}}\) proceeds similarly. Diagonalizing \(\varvec{{L}}_{3}=\eta (s\mathbf {{R}})\) gives

$$\begin{aligned} \varvec{{L}}_{3}=\eta (s)\mathbf {n}\mathbf {n}^{\mathrm {T}}+\eta _{\mathrm {\mathfrak {r}}}(\mathbf {v} \mathbf {v}^{\mathrm {H}}+\bar{\mathbf {v}}\bar{\mathbf {v}}^{\mathrm {H}})+ i\eta _{\mathrm {\mathfrak {i}}}(\mathbf {v}\mathbf {v}^{\mathrm {H}}-\bar{\mathbf {v}}\bar{ \mathbf {v}}^{\mathrm {H}}), \end{aligned}$$
(97)

where \(\eta _{\mathrm {\mathfrak {r}}}\) and \(\eta _{\mathrm {\mathfrak {i}}}\) are defined in (28) and (29) respectively. Substituting (94), (91) and (92) into (97), and taking into account that \(\eta (s)=\frac{\ln s}{s-1}\), we obtain the closed form for \(\varvec{{L}}_{3}\) in (35), from which the closed form for \(\mathrm {ln}\mathbf {{X}}\) in (33) follows.

It can be verified that our closed forms for \(e^{\mathbf {{Y}}}\) and \(\ln \mathbf {{X}}\) are generalizations of those derived for \(\mathcal {{SE}}(3)\) presented in (Agrawal 2006), i.e. when \(s=1\).

1.4 Proof for Theorem 1

Note that in this theorem we only consider input sets \(\{(\varvec{{X}}_{i},\mathsf {w}_{i})\in \mathcal {{DS}}(n)\times \mathbb {R}^{+}\}_{i=1}^{N}\) for which the mean is unique. A recent study by Arnaudon and Miclo (2014) shows that in a complete manifold, such as \(\mathcal {{DS}}(n)\), the mean induced by the Riemannian distance \(d_{\mathrm {R}}\) via (7) is almost surely unique.

The sum of squared divergences in (6) under \(d_{\mathrm {L}}\) is expressed as:

$$\begin{aligned} \mathcal {E}(\mathbf {{X}})&= \mathsf {w}_{\underline{i}}\left\| \mathrm {ln}( \varvec{{X}}_{\underline{i}}^{-1}\mathbf {{X}})\right\| _{\mathrm {F}}^{2}. \end{aligned}$$
(98)

We rely on the minimal representation \(\phi \) in §2.1 to derive the proof. Let \(l_{\mathbf {{Z}}}(\mathbf {{x}}):=\phi ^{-1}\circ L_{\mathbf {{Z}}}\circ \phi (\mathbf {{x}})\) be the equivalent, under \(\phi \), of the left-translation operator \(L_{\mathbf {{Z}}}\) for some \(\mathbf {{Z}}\in \mathcal {{DS}}(n)\). By inspection, we establish the following equations:

$$\begin{aligned} l_{\mathbf {{Z}}}(\mathbf {{x}})_{s}&= \mathrm {ln}\mathbf {{Z}}_{s}+\mathbf {{x}}_{s},\end{aligned}$$
(99)
$$\begin{aligned} l_{\mathbf {{Z}}}(\mathbf {{x}})_{r}&= \mathrm {ln}(\mathbf {{Z}}_{r}\mathrm {e}^{\mathbf {{x}}_{r}^{\times }})_{\times },\end{aligned}$$
(100)
$$\begin{aligned} l_{\mathbf {{Z}}}(\mathbf {{x}})_{t}&= \mathbf {{Z}}_{s}\mathbf {{Z}}_{r}\mathbf {{x}}_{t}+\mathbf {{Z}}_{t}. \end{aligned}$$
(101)

In this section only, we work with the inverses \(\varvec{{X}}_{i}^{-1}\) instead of \(\varvec{{X}}_{i}\) themselves, so that we can expand \(\mathrm {ln}(\varvec{{X}}_{i}^{-1}\mathbf {{X}})\) easily. Let \( m \left( \mathsf {s}_{i}\varvec{{R}}_{i},\varvec{{t}}_{i}\right) :=\varvec{{X}}_{i}^{-1}\) for all \(i=1,\ldots ,N\). Via \(\phi \), using (20) and (99) to (101), and the fact that \(\left\langle \varvec{{I}}_{n},\mathrm {ln}(\varvec{{R}}_{i}\mathrm {e}^{\mathbf {{x}}_{r}^{\times }})\right\rangle =0\) (since the diagonal of any skew-symmetric matrix is zero), (98) expands to:

$$\begin{aligned} \mathcal {E}(\phi (\mathbf {{x}}))&= n\mathsf {w}_{\underline{i}}(\mathrm {ln}\mathsf {s}_{\underline{i}} +\mathbf {{x}}_{s})^{2}+\mathsf {w}_{\underline{i}}\left\| \mathrm {ln}(\varvec{{R}}_{\underline{i}}\mathrm {e}^{\mathbf {{x}}_{r}^{\times }})\right\| _{\mathrm {F}}^{2}\nonumber \\&+\mathsf {w}_{\underline{i}}\left\| \eta (\varvec{{Z}}_{\underline{i}}(\mathbf {{x}}))\varvec{{z}}_{\underline{i}}(\mathbf {{x}})\right\| ^{2}, \end{aligned}$$
(102)

where \(\varvec{{Z}}_{i}(\mathbf {{x}}):=\mathsf {s}_{i}\mathrm {e}^{\mathbf {{x}}_{s}}\varvec{{R}}_{i}\mathrm {e}^{\mathbf {{x}}_{r}^{\times }}\) and \(\varvec{{z}}_{i}(\mathbf {{x}}):=\mathsf {s}_{i}\varvec{{R}}_{i}\mathbf {{x}}_{t}+\varvec{{t}}_{i}\) for all \(i=1,\ldots ,N\), and \(\eta (\cdot )\) is defined in (21).

1.4.1 Non-translation-Compatibility

Without loss of generality, we set \((\mathsf {s}_{i},\varvec{{R}}_{i})=(\bar{s},\bar{\mathbf {{R}}})\) for all \(i=1,\ldots ,N\) and some constant \(\bar{s}>0\) and \(\bar{\mathbf {{R}}}\in \mathcal {{SO}}(n)\). Instead of finding the mean \(\overline{\mathbf {{X}}}\), we prove that for any \(\mathbf {{t}}\in \mathbb {R}^{n}\), \( m \left( \bar{s}^{-1}\bar{\mathbf {{R}}}^{\mathrm {T}},\mathbf {{t}}\right) \) cannot be a mean (note that we are working with \(\varvec{{X}}_{i}^{-1}\)). If this is the case, the actual mean(s) cannot be translation-compatible.

Differentiating (102) with respect to a variable \(\mathbf {{x}}_{s}\) yields:

$$\begin{aligned} \frac{\partial E\circ \phi }{\partial \mathbf {{x}}_{s}}=2n\mathsf {w}_{\underline{i}}( \mathbf {{x}}_{s}+\mathrm {ln}\mathsf {s}_{\underline{i}})+\mathsf {w}_{\underline{i}}\varvec{{z}}_{\underline{i}}(\mathbf {{x}})^{\mathrm {T}}\eta '_{s}(\varvec{{Z}}_{\underline{i}}(\mathbf {{x}}))\varvec{{z}}_{\underline{i}}(\mathbf {{x}}), \end{aligned}$$
(103)

where \(\eta '_{s}(\mathbf {{Z}}_{i}(\mathbf {{x}}))\) is given by (omitting the variable \(\mathbf {{x}}\)):

$$\begin{aligned} \eta '_{s}(\varvec{{Z}}_{i}):=\frac{\partial \eta }{\partial \mathbf {{x}}_{s}}(\varvec{{Z}}_{i})^{\mathrm {T}}\eta (\varvec{{Z}}_{i})+\eta (\varvec{{Z}}_{i})^{\mathrm {T}}\frac{\partial \eta }{\partial \mathbf {{x}}_{s}}(\varvec{{Z}}_{i}). \end{aligned}$$
(104)

Since \(\frac{\partial \varvec{{Z}}_{i}}{\partial \mathbf {{x}}_{s}}(\mathbf {{x}})=\varvec{{Z}}_{i}(\mathbf {{x}})\) for all \(i=1,\ldots ,N\), we get:

$$\begin{aligned} \frac{\partial \eta \circ \varvec{{Z}}_{i}}{\partial \mathbf {{x}}_{s}}&= \frac{\partial }{\partial \mathbf {{x}}_{s}}\sum _{k=1}^{\infty }\frac{(\varvec{{I}}_{n}-\varvec{{Z}}_{i})^{k}}{k+1}\nonumber \\&= \sum _{k=1}^{\infty }\sum _{l=1}^{k}\frac{-1}{k+1}(\varvec{{I}}_{n}-\varvec{{Z}}_{i})^{l-1}\varvec{{Z}}_{i}(\varvec{{I}}_{n}-\varvec{{Z}}_{i})^{k-l}\nonumber \\&= \sum _{k=1}^{\infty }\frac{-k}{k+1}(\varvec{{I}}_{n}-\varvec{{Z}}_{i})^{k-1}\varvec{{Z}}_{i}, \end{aligned}$$
(105)

where the last equality holds because \(\varvec{{Z}}_{i}\) commutes with \((\varvec{{I}}_{n}-\varvec{{Z}}_{i})^{l}\) for all integers \(l\).

Let \(\tilde{\mathbf {{x}}}:=\phi ^{-1}\left( m \left( \bar{s}^{-1}\bar{\mathbf {{R}}}^{\mathrm {T}},\mathbf {{t}}\right) \right) \) for an arbitrary translation \(\mathbf {{t}}\in \mathbb {R}^{n}\). By definition, \(\varvec{{Z}}_{i}(\tilde{\mathbf {{x}}})=\varvec{{I}}_{n}\) for all \(i=1,\ldots ,N\). At the point \(\mathbf {{x}}=\tilde{\mathbf {{x}}}\), applying (105) to find the derivative of \(\eta (\varvec{{Z}}_{i})\) and (21) to evaluate \(\eta (\varvec{{Z}}_{i})\) itself, we get:

$$\begin{aligned} \eta \circ \varvec{{Z}}_{i}(\tilde{\mathbf {{x}}})&= \varvec{{I}}_{n},\end{aligned}$$
(106)
$$\begin{aligned} \frac{\partial \eta \circ \varvec{{Z}}_{i}}{\partial \mathbf {{x}}_{s}}(\tilde{\mathbf {{x}}})&= -0.5\varvec{{Z}}_{i}=-0.5\varvec{{I}}_{n}. \end{aligned}$$
(107)

It follows from (104) that:

$$\begin{aligned} \eta '_{s}(\varvec{{Z}}_{i}(\tilde{\mathbf {{x}}}))=-\varvec{{I}}_{n}. \end{aligned}$$
(108)

Substituting the equation back to (103), we obtain:

$$\begin{aligned} \frac{\partial E\circ \phi }{\partial \mathbf {{x}}_{s}}(\tilde{\mathbf {{x}}})=-\mathsf {w}_{\underline{i}}\varvec{{z}}_{\underline{i}}(\tilde{\mathbf {{x}}})^{\mathrm {T}}\varvec{{z}}_{\underline{i}}(\tilde{\mathbf {{x}}})=-\mathsf {w}_{\underline{i}}\left\| \varvec{{z}}_{\underline{i}}(\tilde{\mathbf {{x}}})\right\| ^{2}. \end{aligned}$$
(109)

Clearly, the right-hand side of the above equation is always negative. Because the partial derivative \(\frac{\partial E\circ \phi }{\partial \mathbf {{x}}_{s}}\) does not vanish at \(\tilde{\mathbf {{x}}}\), no direct similarity \( m \left( \bar{s}^{-1}\bar{\mathbf {{R}}}^{\mathrm {T}},\mathbf {{t}}\right) \) can be a mean, proving that the actual mean is not translation-compatible.
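The effect is easy to reproduce numerically. The sketch below (ours) minimizes the sum of squared Lie divergences (98) over \(\mathcal {{DS}}(2)\) for three inputs sharing scale 1 and rotation 0 but differing in translation; the optimizer settles on nonzero scale/rotation components, illustrating the non-translation-compatibility just proved:

```python
# Sketch (ours): the Lie mean of pure translations is not a pure translation.
import numpy as np
from scipy.linalg import logm
from scipy.optimize import minimize

def ds2(a, th, t):  # m(e^a R(th), t) as a homogeneous 3x3 matrix
    X = np.eye(3)
    X[:2, :2] = np.exp(a) * np.array([[np.cos(th), -np.sin(th)],
                                      [np.sin(th), np.cos(th)]])
    X[:2, 2] = t
    return X

Xs = [ds2(0.0, 0.0, t) for t in ([0.0, 0.0], [4.0, 0.0], [0.0, 4.0])]

def E(p):  # sum of squared Lie divergences, cf. (98), with equal weights
    X = ds2(p[0], p[1], p[2:])
    return sum(np.linalg.norm(logm(np.linalg.inv(Xi) @ X), 'fro') ** 2
               for Xi in Xs)

p = minimize(E, np.zeros(4), method='Nelder-Mead',
             options={'maxiter': 2000, 'xatol': 1e-8, 'fatol': 1e-10}).x
print(p[:2])  # scale/rotation components of the Lie mean: nonzero
```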

1.4.2 Scale-Compatibility and Rotation-Compatibility

To verify scale-compatibility and rotation-compatibility, we set \(\varvec{{t}}_{i}:=-\mathsf {s}_{i}^{-1}\varvec{{R}}_{i}^{\mathrm {T}}\bar{\mathbf {{t}}}\) for all \(i=1,\ldots ,N\) and some constant \(\bar{\mathbf {{t}}}\in \mathbb {R}^{n}\) (so that all \(\varvec{{X}}_{i}\)’s translation components equal \(\bar{\mathbf {{t}}}\)).

Differentiating \(E\circ \phi (\mathbf {{x}})\) with respect to \(\mathbf {{x}}_{t}\) yields:

$$\begin{aligned} \frac{\partial E\circ \phi }{\partial \mathbf {{x}}_{t}}=2\mathsf {w}_{\underline{i}}\varvec{{z}}_{\underline{i}}^{\mathrm {T}}\eta (\varvec{{Z}}_{\underline{i}})^{\mathrm {T}}\eta (\varvec{{Z}}_{\underline{i}})\mathsf {s}_{\underline{i}}\varvec{{R}}_{\underline{i}} . \end{aligned}$$
(110)

It immediately follows that the derivative \(\frac{\partial E\circ \phi }{\partial \mathbf {{x}}_{t}}\) only vanishes when \(\mathbf {{x}}_{t}=\bar{\mathbf {{t}}}\), at which point the third term of (102) also vanishes. When this occurs, \(\overline{\mathbf {{X}}}_{s}\) becomes a geometric mean, and \(\overline{\mathbf {{X}}}_{r}\) becomes the mean of rotations under the intrinsic Riemannian distance (Park and Ravani 1997). Hence, the scale-compatibility and rotation-compatibility properties are verified.

1.5 Proof for Theorem 2

We will prove the third statement, i.e. for translation-compatibility. The other two statements follow analogously.

Suppose all direct similarities are written as \(\varvec{{x}}_{i}:=\phi ^{-1}(\varvec{{X}}_{i})\), and their mean \(\bar{\mathbf {{X}}}\) as \(\bar{\mathbf {{x}}}:=\phi ^{-1}(\bar{\mathbf {{X}}})\), under the map \(\phi \). Denote by \(\mathbf {{x}}_{sr}\) the first \(n_{r}+1\) components of \(\mathbf {{x}}\). Translation compatibility means that if \(\varvec{{x}}_{i;sr}=\mathbf {{t}}'\) for all \(i=1,\ldots ,N\) and some constant \(\mathbf {{t}}'\in \mathbb {R}^{n_{r}+1}\), then \(\bar{\mathbf {{x}}}_{sr}=\mathbf {{t}}'\). We will prove that the following statements are equivalent:

  • A: every unique mean induced by \(g\) is translation-compatible,

  • B: \(\mathcal {{T}}(n)\) is totally geodesic in \(\mathcal {{DS}}(n)\).

1.5.1 From A to B

Choose any two points \(\mathbf {{x}},\mathbf {{y}}\in \mathbb {R}^{n_{ds}}\) such that \(\mathbf {{x}}_{sr}=\mathbf {{y}}_{sr}=\mathbf {{0}}\) and that \(\phi (\mathbf {{x}})\) and \(\phi (\mathbf {{y}})\) are within the injectivity radius of each other. Suppose \(\gamma (u)\) is the \(g\)-geodesic with the arc-length parameterization going from \(\gamma (0)=\phi (\mathbf {{x}})\) to \(\gamma (a)=\phi (\mathbf {{y}})\) for some constant \(a>0\). It suffices to show that \(\phi ^{-1}\circ \gamma (u)_{i}=0\) for all \(u\in [0,a]\) and all \(i\in \mathcal {{J}}_{s}\cup \mathcal {{J}}_{r}\).

Suppose there is a number \(\mathsf {u}_{0}\in (0,a)\) and a dimension \(i\in \mathcal {{J}}_{s}\cup \mathcal {{J}}_{r}\) such that \(\phi ^{-1}\circ \gamma (\mathsf {u}_{0})_{i}\ne 0\). Let \(\mathsf {u}_{1}<\mathsf {u}_{0}\) be the largest number such that \(\phi ^{-1}\circ \gamma (\mathsf {u}_{1})_{i}=0\) and \(\mathsf {u}_{2}>\mathsf {u}_{0}\) be the smallest number such that \(\phi ^{-1}\circ \gamma (\mathsf {u}_{2})_{i}=0\). Then, for any \(u\in (\mathsf {u}_{1},\mathsf {u}_{2})\), we must have \(\phi ^{-1}\circ \gamma (u)_{i}\ne 0\). However, \(\gamma (\frac{\mathsf {u}_{1}+\mathsf {u}_{2}}{2})\) is the mean of \(\gamma (\mathsf {u}_{1})\) and \(\gamma (\mathsf {u}_{2})\). Hence, by our translation-compatibility definition, \(\phi ^{-1}\circ \gamma (\frac{\mathsf {u}_{1}+\mathsf {u}_{2}}{2})_{i}=0\), leading to a contradiction.

Therefore, \(\mathcal {{T}}(n)\) is totally geodesic in \(\mathcal {{DS}}(n)\).

1.5.2 From B to A

We only have to show that the mean is translation-compatible when all direct similarities \(\varvec{{X}}_{i}\) are in \(\mathcal {{T}}(n)\). In all other cases, the direct similarities live in a coset \(\mathbf {{Z}}\mathcal {{T}}(n):=\{\mathbf {{Z}}\mathbf {{X}}:\mathbf {{X}}\in \mathcal {{T}}(n)\}\) of \(\mathcal {{T}}(n)\) for some \(\mathbf {{Z}}\in \mathcal {{DS}}(n)\). In these cases, we left-translate the direct similarities by \(\mathbf {{Z}}^{-1}\), compute the mean, and left-translate it by \(\mathbf {{Z}}\); the result is a translation-compatible mean because the metric tensor \(g\) is left-invariant.

Suppose the metric tensor induced by \(g\) on \(\mathcal {{T}}(n)\) is \(g'\). Let \(d_{\mathrm {R}}\) and \(d_{\mathrm {R}}'\) denote the Riemannian distances in \(\mathcal {{DS}}(n)\) and \(\mathcal {{T}}(n)\) respectively. Since \(\mathcal {{T}}(n)\) is totally geodesic in \(\mathcal {{DS}}(n)\), \(d_{\mathrm {R}}'\) is just the restriction of \(d_{\mathrm {R}}\) to \(\mathcal {{T}}(n)\). Let \(\bar{\mathbf {{X}}}'\) be the mean induced by \(d_{\mathrm {R}}'\) restricted to \(\mathcal {{T}}(n)\), which is automatically translation-compatible. It suffices to prove that \(\bar{\mathbf {{X}}}'=\bar{\mathbf {{X}}}\).

Let \(f_{i}'(\mathbf {{X}}):=d_{\mathrm {R}}'(\mathbf {{X}},\varvec{{X}}_{i})^{2}\) be the function that measures the restricted squared Riemannian distance between any translation \(\mathbf {{X}}\in \mathcal {{T}}(n)\) and a given direct similarity \(\varvec{{X}}_{i}\). Let \(f_{i}(\mathbf {{X}}):=d_{\mathrm {R}}(\mathbf {{X}},\varvec{{X}}_{i})^{2}\) be the corresponding version in \(\mathcal {{DS}}(n)\). By definition,

$$\begin{aligned} \bar{\mathbf {{X}}}'&= \mathop {\hbox {argmin}}\limits _{\mathbf {{X}}\in \mathcal {{T}}(n)}\mathsf {w}_{\underline{i}}f_{\underline{i}}'(\mathbf {{X}}),\end{aligned}$$
(111)
$$\begin{aligned} \bar{\mathbf {{X}}}&= \mathop {\hbox {argmin}}\limits _{\mathbf {{X}}\in \mathcal {{DS}}(n)}\mathsf {w}_{\underline{i}}f_{\underline{i}}(\mathbf {{X}}). \end{aligned}$$
(112)

According to (37), the gradient of \(f_{i}'(\mathbf {{X}})\) is minus twice the velocity of the \(g'\)-geodesic \(\gamma \) that starts at \(\gamma (0)=\mathbf {{X}}\) and ends at \(\gamma (1)=\varvec{{X}}_{i}\). Since this geodesic is also the \(g\)-geodesic connecting \(\mathbf {{X}}\) with \(\varvec{{X}}_{i}\), we must have \(\mathrm {grad}f_{i}'=\mathrm {grad}f_{i}.\) Hence \(\mathrm {grad}(\mathsf {w}_{\underline{i}}f_{\underline{i}}')= \mathrm {grad}(\mathsf {w}_{\underline{i}}f_{\underline{i}})\). This implies that both gradients vanish concurrently. In other words, a \(g'\)-mean is also a \(g\)-mean.

This statement alone does not ensure that a \(g\)-mean is a \(g'\)-mean. However, because we restrict ourselves to the case where the \(g\)-mean is unique, if a \(g'\)-mean exists, it must be the unique \(g\)-mean. A \(g'\)-mean does exist, because the variance (6) is lower-bounded by 0 and \(\mathcal {{T}}(n)\) is locally compact.

Therefore, \(\bar{\mathbf {{X}}}'=\bar{\mathbf {{X}}}\).

1.6 Proof for Theorem 3

First, we determine the expression of the metric tensor at each element \(\mathbf {{X}}\in \mathcal {{DS}}(n)\). Let \(\mathbf {{U}}\) be a tangent vector at \(\mathbf {{X}}\) with local coordinates \(\mathbf {{u}}:=\varphi _{\mathbf {{x}}}(\mathbf {{U}})\). Let \(\mathbf {{U}}':=dL_{\mathbf {{X}}^{-1}}(\mathbf {{U}})=\mathbf {{X}}^{-1}\mathbf {{U}}\) and let \(\mathbf {{u}}':=\varphi _{\mathbf {{0}}}(\mathbf {{U}}')\) be its coordinates. By inspection, the linear relationship between \(\mathbf {{u}}\) and \(\mathbf {{u}}'\) is given by:

$$\begin{aligned} \mathbf {{u}}'_{s}&= \mathbf {{u}}_{s},\end{aligned}$$
(113)
$$\begin{aligned} \mathbf {{u}}'_{r}&= \tilde{\Psi }_{\mathbf {{x}}_{r}}\mathbf {{u}}_{r},\end{aligned}$$
(114)
$$\begin{aligned} \mathbf {{u}}'_{t}&= \mathrm {e}^{-\mathbf {{x}}_{s}}\mathrm {e}^{-\mathbf {{x}}_{r}^{\times }}\mathbf {{u}}_{t}. \end{aligned}$$
(115)

In other words, \(\mathbf {{u}}'=\mathbf {{D}}_{\mathbf {{x}}}\mathbf {{u}}\), where \(\mathbf {{D}}_{\mathbf {{x}}}\) is a block-diagonal matrix:

$$\begin{aligned} \mathbf {{D}}_{\mathbf {{x}}}:=\left[ \begin{array}{ccc} 1 &{} \mathbf {{0}}&{} \mathbf {{0}}\\ \mathbf {{0}}&{} \tilde{\Psi }_{\mathbf {{x}}_{r}} &{} \mathbf {{0}}\\ \mathbf {{0}}&{} \mathbf {{0}}&{} \mathrm {e}^{-\mathbf {{x}}_{s}}\mathrm {e}^{-\mathbf {{x}}_{r}^{\times }} \end{array}\right] . \end{aligned}$$
(116)

With this, the metric tensor Eq. (39) at \(\mathbf {{X}}=\phi (\mathbf {{x}})\) has a short form, \(g_{\mathbf {{X}}}(\mathbf {{U}},\mathbf {{V}})=\varphi _{\mathbf {{x}}}(\mathbf {{U}})^{\mathrm {T}}\mathbf {{G}}(\mathbf {{x}})\varphi _{\mathbf {{x}}}(\mathbf {{V}})\), where

$$\begin{aligned} \mathbf {{G}}(\mathbf {{x}}):=\mathbf {{D}}_{\mathbf {{x}}}^{\mathrm {T}}\tilde{\mathbf {{G}}}\mathbf {{D}}_{\mathbf {{x}}}. \end{aligned}$$
(117)
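
To make this construction concrete, the following Python sketch builds \(\mathbf {{D}}_{\mathbf {{x}}}\) and \(\mathbf {{G}}(\mathbf {{x}})\) for \(\mathcal {{DS}}(2)\). It is a minimal illustration, not part of the proof: the helper names `rot2` and `metric_tensor` are ours, \(\tilde{\mathbf {{G}}}\) is given an assumed block-diagonal form \(\mathrm {diag}(1/\sigma _{s}^{2},1/\sigma _{r}^{2},\mathbf {{I}}/\sigma _{t}^{2})\), and we use the fact that in two dimensions the rotation block \(\tilde{\Psi }_{\mathbf {{x}}_{r}}\) reduces to the scalar 1 because planar rotations commute.

```python
import numpy as np

def rot2(a):
    """2-D rotation matrix, i.e. e^{a^x} for a scalar angle a."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s], [s, c]])

def metric_tensor(x, G_tilde):
    """Metric tensor G(x) = D_x^T G~ D_x of Eq. (117) for DS(2),
    with local coordinates x = (x_s, x_r, x_t1, x_t2)."""
    D = np.eye(4)                            # D_x of Eq. (116); Psi~ = 1 in 2-D
    D[2:, 2:] = np.exp(-x[0]) * rot2(-x[1])  # e^{-x_s} e^{-x_r^x}, cf. Eq. (115)
    return D.T @ G_tilde @ D

# An assumed block-diagonal metric at the identity.
sigma_s, sigma_r, sigma_t = 1.0, 1.0, 2.0
G_tilde = np.diag([1 / sigma_s**2, 1 / sigma_r**2,
                   1 / sigma_t**2, 1 / sigma_t**2])
G = metric_tensor(np.array([0.3, 0.7, 1.0, -2.0]), G_tilde)
assert np.allclose(G, G.T) and np.all(np.linalg.eigvalsh(G) > 0)
```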

Next, pick two arbitrary coordinates \(\mathbf {{p}},\mathbf {{q}}\in \mathbb {R}^{n_{ds}}\) under the map \(\phi \) such that \(\mathbf {{p}}\ne \mathbf {{q}}\) and their first \(n_{r}+1\) components are zero: \(\mathbf {{p}}_{sr}=\mathbf {{q}}_{sr}=\mathbf {{0}}\). Consider the geodesic \(\gamma \) going from \(\gamma (0)=\phi (\mathbf {{p}})\) to \(\gamma (1)=\phi (\mathbf {{q}})\). Let its coordinate functions be \(\mathbf {{x}}(u):=\phi ^{-1}\circ \gamma (u)\). We will prove that \(\mathcal {{T}}(n)\) is not totally geodesic in \(\mathcal {{DS}}(n)\) by showing that no such \(\gamma \) can also satisfy \(\mathbf {{x}}(u)_{sr}=\mathbf {{0}}\) for all \(u\in (0,1)\).

In Riemannian geometry (see, e.g., Lee 1997), geodesics obey the following geodesic equations, for all \(k=1,\ldots ,n_{ds}\):

$$\begin{aligned} \ddot{\mathbf {{x}}}_{k}(u)+\dot{\mathbf {{x}}}_{\underline{i}}(u)\dot{\mathbf {{x}}}_{\underline{j}}(u) \Gamma _{\underline{i},\underline{j};k}(\mathbf {{x}}(u))=0, \end{aligned}$$
(118)

where \(\dot{\mathbf {{x}}}\) and \(\ddot{\mathbf {{x}}}\) are respectively the first-order and second-order derivatives of \(\mathbf {{x}}\), and \(\Gamma _{i,j;k}\) are the Christoffel symbols, related to the local coordinates of the metric tensor via, for all \(i,j,k=1,\ldots ,n_{ds}\):

$$\begin{aligned} \Gamma _{i,j;\underline{l}}(\mathbf {{x}})\mathbf {{G}}(\mathbf {{x}})_{\underline{l},k}=0.5(\partial _{i}\mathbf {{G}}(\mathbf {{x}})_{j,k}+\partial _{j}\mathbf {{G}}(\mathbf {{x}})_{i,k}-\partial _{k}\mathbf {{G}}(\mathbf {{x}})_{i,j}), \end{aligned}$$
(119)

where \(\partial _{i}\) for all \(i=1,\ldots ,n_{ds}\) are partial derivative operators.

Under the extra condition that \(\mathbf {{x}}(u)_{sr}=\mathbf {{0}}\), we get \(\dot{\mathbf {{x}}}(u)_{sr}=\mathbf {{0}}\) and \(\ddot{\mathbf {{x}}}(u)_{sr}=\mathbf {{0}}\), for all \(u\in (0,1)\). The first geodesic equation (\(k=1\)) of (118) simplifies to:

$$\begin{aligned} \sum _{i\in \mathcal {{J}}_{t}}\sum _{j\in \mathcal {{J}}_{t}}\dot{\mathbf {{x}}}_{i}(u)\dot{\mathbf {{x}}}_{j}(u)\Gamma _{i,j;1}(\mathbf {{x}}(u))=0. \end{aligned}$$
(120)

To find \(\Gamma _{i,j;1}(\mathbf {{x}}(u))\) for all \(i,j\in \mathcal {{J}}_{t}\), we substitute (117) into (119) and obtain \(\Gamma _{i,j;1}(\mathbf {{x}}(u))=\mathbf {{H}}(u)_{i,j},\) where:

$$\begin{aligned} \mathbf {{H}}(u):=2\mathrm {e}^{-2\mathbf {{x}}_{s}}\mathrm {e}^{\mathbf {{x}}_{r}^{\times }}\tilde{\mathbf {{G}}}_{tt}\mathrm {e}^{-\mathbf {{x}}_{r}^{\times }}, \end{aligned}$$
(121)

and \(\tilde{\mathbf {{G}}}_{tt}\) is the \(n\)-by-\(n\) bottom-right submatrix of \(\tilde{\mathbf {{G}}}\) (the matrix representing the metric tensor restricted to translations only). Writing \(\mathbf {{v}}_{t}(u):=\dot{\mathbf {{x}}}_{t}(u)\) for the translational velocity, we rewrite (120) as:

$$\begin{aligned} \mathbf {{v}}_{t}(u)^{\mathrm {T}}\mathbf {{H}}(u)\mathbf {{v}}_{t}(u)=0. \end{aligned}$$
(122)

The matrix \(\mathbf {{H}}(u)\) is symmetric positive-definite, since \(\tilde{\mathbf {{G}}}_{tt}\) is symmetric positive-definite, \(\mathrm {e}^{\mathbf {{x}}_{r}^{\times }}\) is a rotation matrix with inverse \(\mathrm {e}^{-\mathbf {{x}}_{r}^{\times }}\), and \(\mathrm {e}^{-2\mathbf {{x}}_{s}}>0\). Thus, (122) holds if and only if \(\mathbf {{v}}_{t}(u)=\mathbf {{0}}\) for all \(u\in (0,1)\). This is impossible: integrating \(\dot{\mathbf {{x}}}_{t}(u)=\mathbf {{v}}_{t}(u)=\mathbf {{0}}\) would give \(\mathbf {{x}}_{t}(0)=\mathbf {{x}}_{t}(1)\), and hence \(\mathbf {{p}}=\mathbf {{q}}\), a contradiction.

Therefore, no geodesic from \(\phi (\mathbf {{p}})\) to \(\phi (\mathbf {{q}})\) lies entirely in \(\mathcal {{T}}(n)\), proving that \(\mathcal {{T}}(n)\) is not totally geodesic in \(\mathcal {{DS}}(n)\).
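
The positive-definiteness claim can also be checked numerically. The sketch below (again with our own helper names, for \(\mathcal {{DS}}(2)\) with a block-diagonal \(\tilde{\mathbf {{G}}}\)) computes first-kind Christoffel symbols from (119) by central finite differences and verifies that the translation block at \(k=1\) is symmetric positive-definite; it agrees with (121) up to convention-dependent constant factors, since at points with \(\mathbf {{x}}_{sr}=\mathbf {{0}}\) the metric reduces to \(\tilde{\mathbf {{G}}}\).

```python
import numpy as np

def rot2(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s], [s, c]])

def G_of(x, G_tilde):
    # Metric tensor G(x) = D_x^T G~ D_x for DS(2), as in the earlier sketch.
    D = np.eye(4)
    D[2:, 2:] = np.exp(-x[0]) * rot2(-x[1])
    return D.T @ G_tilde @ D

def christoffel_first_kind(x, G_tilde, h=1e-6):
    """Gamma_{ij,k} = 0.5 (d_i G_jk + d_j G_ik - d_k G_ij), cf. Eq. (119),
    with the partial derivatives taken by central finite differences."""
    dG = np.empty((4, 4, 4))
    for k in range(4):
        e = np.zeros(4)
        e[k] = h
        dG[k] = (G_of(x + e, G_tilde) - G_of(x - e, G_tilde)) / (2 * h)
    Gam = np.empty((4, 4, 4))
    for i in range(4):
        for j in range(4):
            for k in range(4):
                Gam[i, j, k] = 0.5 * (dG[i][j, k] + dG[j][i, k] - dG[k][i, j])
    return Gam

G_tilde = np.diag([1.0, 1.0, 0.25, 0.25])
x = np.array([0.0, 0.0, 1.5, -0.5])          # a point with x_sr = 0
Gam = christoffel_first_kind(x, G_tilde)
H_tt = Gam[2:, 2:, 0]                        # translation block at k = 1 (scale)
assert np.all(np.linalg.eigvalsh(H_tt) > 0)  # symmetric positive-definite
```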

1.7 Proof for Theorem 4

To prove that \(d_{\alpha }\) is left-invariant, we show that \(d_{\alpha }\) is related to a pseudo-seminorm by the formula \(d_{\alpha }(\mathbf {{X}},\mathbf {{Y}})=h_{\alpha }(\mathbf {{X}}^{-1}\mathbf {{Y}})\), where

$$\begin{aligned} h_{\alpha }(\mathbf {{Z}})&:= \sqrt{\frac{(\mathrm {ln}\mathbf {{Z}}_{s})^{2}}{\sigma _{s}^{2}}+\frac{\left\| \mathbf {{Z}}_{r}-\varvec{{I}}_{n}\right\| _{\mathrm {F}}^{2}}{\sigma _{r}^{2}}+\frac{\left\| \mathbf {{Z}}_{t}\right\| ^{2}}{\sigma _{t}^{2}\mathbf {{Z}}_{s}^{1-\alpha }}},\end{aligned}$$
(123)
$$\begin{aligned} \mathbf {{X}}^{-1}\mathbf {{Y}}&= m \left( \frac{\mathbf {{Y}}_{s}}{\mathbf {{X}}_{s}}\mathbf {{X}}_{r}^{\mathrm {T}}\mathbf {{Y}}_{r},\frac{\mathbf {{X}}_{r}^ {\mathrm {T}}(\mathbf {{Y}}_{t}-\mathbf {{X}}_{t})}{\mathbf {{X}}_{s}}\right) . \end{aligned}$$
(124)

Evaluating \(h_{\alpha }(\mathbf {{X}}^{-1}\mathbf {{Y}})^{2}\) yields:

$$\begin{aligned} h_{\alpha }(\mathbf {{X}}^{-1}\mathbf {{Y}})^{2}&= \frac{(\mathrm {ln}\mathbf {{Y}}_{s}-\mathrm {ln}\mathbf {{X}}_{s})^{2}}{\sigma _{s}^{2}}+ \frac{\left\| \mathbf {{X}}_{r}^{\mathrm {T}}\mathbf {{Y}}_{r}-\varvec{{I}}_{n}\right\| _{\mathrm {F}}^{2}}{\sigma _{r}^{2}}\nonumber \\&+\frac{\left\| \mathbf {{X}}_{r}^{\mathrm {T}}(\mathbf {{Y}}_{t}-\mathbf {{X}}_{t})\right\| ^{ 2}}{\sigma _{t}^{2}\mathbf {{X}}_{s}^{1+\alpha }\mathbf {{Y}}_{s}^{1-\alpha }}=d_{\alpha }(\mathbf {{X}},\mathbf {{Y}})^{2}, \end{aligned}$$
(125)

where the last equality holds because the Frobenius norm and the vector norm are rotation-invariant. Since \(\mathbf {{X}}^{-1}\mathbf {{Y}}=(\mathbf {{Z}}\mathbf {{X}})^{-1}(\mathbf {{Z}}\mathbf {{Y}})\), it follows that \(h_{\alpha }(\mathbf {{X}}^{-1}\mathbf {{Y}})=h_{\alpha }((\mathbf {{Z}}\mathbf {{X}})^{-1}(\mathbf {{Z}}\mathbf {{Y}}))\), proving that \(d_{\alpha }\) is left-invariant.
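
As a quick numerical check of this invariance, consider the sketch below. It is illustrative only: the helpers `compose`, `d_alpha` and `rand_ds` are our own, and a direct similarity is stored as a triple \((s,\mathbf {{R}},\mathbf {{t}})\) acting as \(\mathbf {{x}}\mapsto s\mathbf {{R}}\mathbf {{x}}+\mathbf {{t}}\), so that composition gives \(\mathbf {{Z}}\mathbf {{X}}=(\mathbf {{Z}}_{s}\mathbf {{X}}_{s},\,\mathbf {{Z}}_{r}\mathbf {{X}}_{r},\,\mathbf {{Z}}_{s}\mathbf {{Z}}_{r}\mathbf {{X}}_{t}+\mathbf {{Z}}_{t})\).

```python
import numpy as np

def compose(A, B):
    """Composition of direct similarities (s, R, t): x -> s R x + t."""
    sA, RA, tA = A
    sB, RB, tB = B
    return (sA * sB, RA @ RB, sA * RA @ tB + tA)

def d_alpha(X, Y, alpha, sig_s=1.0, sig_r=1.0, sig_t=1.0):
    """SRT divergence of Eq. (125)."""
    sX, RX, tX = X
    sY, RY, tY = Y
    term_s = (np.log(sY) - np.log(sX))**2 / sig_s**2
    term_r = np.linalg.norm(RX.T @ RY - np.eye(len(tX)))**2 / sig_r**2
    term_t = np.linalg.norm(RX.T @ (tY - tX))**2 / (
        sig_t**2 * sX**(1 + alpha) * sY**(1 - alpha))
    return np.sqrt(term_s + term_r + term_t)

def rand_ds(rng):
    """A random direct similarity in DS(2)."""
    a = rng.uniform(0, 2 * np.pi)
    R = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    return (rng.uniform(0.5, 2.0), R, rng.standard_normal(2))

rng = np.random.default_rng(0)
X, Y, Z = rand_ds(rng), rand_ds(rng), rand_ds(rng)
for alpha in (-1.0, 0.0, 0.5, 1.0):
    assert np.isclose(d_alpha(X, Y, alpha),
                      d_alpha(compose(Z, X), compose(Z, Y), alpha))
```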

1.8 Proof for Lemma 6

The weighted sum of squared divergences in (6) can be rewritten as:

$$\begin{aligned} \mathsf {w}_{\underline{i}}d_{\alpha }(\varvec{{X}}_{\underline{i}}, \mathbf {{X}})^{2}=\frac{E_{s}(\mathbf {{X}})}{\sigma _{s}^{2}}+\frac{E_{r}(\mathbf {{X}})}{ \sigma _{r}^{2}}+\frac{E_{t;\alpha }(\mathbf {{X}})}{\sigma _{t}^{2}}, \end{aligned}$$
(126)

where \(E_{s}(\mathbf {{X}})=\mathsf {w}_{\underline{i}}d_{s}(\varvec{{X}}_{\underline{i}},\mathbf {{X}})^{2}\), \(E_{r}(\mathbf {{X}})=\mathsf {w}_{\underline{i}}d_{r}(\varvec{{X}}_{\underline{i}},\mathbf {{X}})^{2}\), and \(E_{t;\alpha }(\mathbf {{X}})=\mathsf {w}_{\underline{i}}d_{t;\alpha }( \varvec{{X}}_{\underline{i}},\mathbf {{X}})^{2}\). Since \(\mathbf {{X}}_{r}\) only appears in \(E_{r}(\mathbf {{X}})\), we obtain

$$\begin{aligned} \overline{\mathbf {{X}}}_{r}&= \mathop {\hbox {argmin}}\limits _{\mathbf {{R}}\in \mathcal {{SO}}(n)}\mathsf {w}_{\underline{i}}\left\| \mathbf {{R}}-\varvec{{X}}_{\underline{i};r}\right\| _{\mathrm {F}}^{2}\nonumber \\&= \mathrm {sop}(\mathsf {w}_{\underline{i}}\varvec{{X}}_{\underline{i};r}), \end{aligned}$$
(127)

where the last equation follows from (79). Likewise, since \(\mathbf {{X}}_{t}\) only appears in \(E_{t;\alpha }(\mathbf {{X}})\),

$$\begin{aligned} \overline{\mathbf {{X}}}_{t}=\mathop {\hbox {argmin}}\limits _{\mathbf {{t}}\in \mathbb {R}^{n}}\frac{ \mathsf {w}_{\underline{i}}}{\varvec{{X}}_{\underline{i};s}^{1+\alpha }}\left\| \mathbf {{t}}-\varvec{{X}}_{\underline{i};t}\right\| ^{2}, \end{aligned}$$
(128)

and (60) follows.
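
Both closed forms are cheap to compute. The following sketch (our own helper names; \(\mathrm {sop}\) implemented as the standard SVD projection onto \(\mathcal {{SO}}(n)\), and (60) taken to be the weighted arithmetic mean that minimizes (128)) operates on direct similarities stored as \((s_{i},\mathbf {{R}}_{i},\mathbf {{t}}_{i})\) triples:

```python
import numpy as np

def sop(M):
    """Projection onto SO(n) minimizing ||R - M||_F (cf. Eq. (79)),
    via SVD with a determinant correction on the last singular vector."""
    U, _, Vt = np.linalg.svd(M)
    d = np.sign(np.linalg.det(U @ Vt))
    return U @ np.diag([1.0] * (M.shape[0] - 1) + [d]) @ Vt

def srt_mean_rot_trans(sims, weights, alpha):
    """Closed-form rotation and translation parts of the SRT mean,
    Eqs. (127) and (60); sims is a list of (s_i, R_i, t_i) triples."""
    w = np.asarray(weights, dtype=float)
    R_bar = sop(sum(wi * Ri for wi, (_, Ri, _) in zip(w, sims)))
    c = np.array([wi / si**(1 + alpha) for wi, (si, _, _) in zip(w, sims)])
    t_bar = sum(ci * ti for ci, (_, _, ti) in zip(c, sims)) / c.sum()
    return R_bar, t_bar
```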

To find \(\overline{\mathbf {{X}}}_{s}\), we substitute \(\overline{\mathbf {{X}}}_{t}\) back into (6) and drop the rotation term \(\frac{E_{r}(\mathbf {{X}})}{\sigma _{r}^{2}}\), which does not depend on \(\mathbf {{X}}_{s}\), obtaining the optimization problem:

$$\begin{aligned} \overline{\mathbf {{X}}}_{s}=\mathop {\hbox {argmin}}\limits _{s\in \mathbb {R}^{+}}\frac{\mathsf {w}_{ \underline{i}}}{\sigma _{s}^{2}}\left( \mathrm {ln}\frac{s}{\varvec{{X}}_{\underline{i};s}}\right) ^{2}+\frac{\mathsf {w}_{\underline{i}}/\sigma _{t}^{ 2}}{\varvec{{X}}_{\underline{i};s}^{1+\alpha }s^{1-\alpha }}\left\| \overline{\mathbf {{X}}}_{t}-\varvec{{X}}_{\underline{i};t}\right\| ^{2}.\nonumber \\ \end{aligned}$$
(129)

Setting \(z=\mathrm {ln}\,s\), we obtain the convex objective function (58), a sum of a quadratic term and an exponential term. Its minimizer satisfies a transcendental equation \(Az=\mathrm {e}^{Bz}\) for some constants \(A,B\), and therefore cannot be expressed in closed form. However, since the objective is convex in \(z\), a Newton-based method finds the minimizer efficiently.

When \(\alpha =1\), the exponential term becomes constant in \(z\), and the minimizer takes the closed form (61).
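
A minimal Newton iteration for the scale part might look as follows. This is a sketch under our own naming, initialized at the weighted log-mean of the input scales; for \(\alpha =1\) the exponential term is constant, so the iteration returns the weighted geometric mean of (61) immediately.

```python
import numpy as np

def srt_mean_scale(sims, weights, alpha, t_bar,
                   sig_s=1.0, sig_t=1.0, iters=20):
    """Newton's method for the scale part of the SRT mean, i.e. the
    minimizer of (129) in z = ln s; sims is a list of (s_i, R_i, t_i)."""
    w = np.asarray(weights, dtype=float)
    log_s = np.array([np.log(si) for si, _, _ in sims])
    # Constant multiplying the exponential term e^{(alpha - 1) z}.
    C = sum(wi * np.dot(t_bar - ti, t_bar - ti) / si**(1 + alpha)
            for wi, (si, _, ti) in zip(w, sims)) / sig_t**2
    z = np.average(log_s, weights=w)   # initialize at the weighted log-mean
    for _ in range(iters):
        e = np.exp((alpha - 1) * z)
        grad = 2 * np.sum(w * (z - log_s)) / sig_s**2 + (alpha - 1) * e * C
        hess = 2 * np.sum(w) / sig_s**2 + (alpha - 1)**2 * e * C
        z -= grad / hess               # Newton step on a convex 1-D objective
    return np.exp(z)
```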
