International Journal of Computer Vision, Volume 112, Issue 3, pp 285–306

Distances and Means of Direct Similarities

  • Minh-Tri Pham
  • Oliver J. Woodford
  • Frank Perbet
  • Atsuto Maki
  • Riccardo Gherardi
  • Björn Stenger
  • Roberto Cipolla


The non-Euclidean nature of direct isometries in a Euclidean space, i.e. transformations consisting of a rotation and a translation, creates difficulties when computing distances, means and distributions over them, which have been well studied in the literature. Direct similarities, transformations consisting of a direct isometry and a positive uniform scaling, present even more of a challenge—one which we demonstrate and address here. In this article, we investigate divergences (a superset of distances without constraints on symmetry and sub-additivity) for comparing direct similarities, and means induced by them via minimizing a sum of squared divergences. We analyze several standard divergences: the Euclidean distance using the matrix representation of direct similarities, a divergence from Lie group theory, and the family of all left-invariant distances derived from Riemannian geometry. We derive their properties and those of their induced means, highlighting several shortcomings. In addition, we introduce a novel family of left-invariant divergences, called SRT divergences, which resolve several issues associated with the standard divergences. In our evaluation we empirically demonstrate the derived properties of the divergences and means, both qualitatively and quantitatively, on synthetic data. Finally, we compare the divergences in a real-world application: vote-based, scale-invariant object recognition. Our results show that the new divergences presented here, and their means, are both more effective and faster to compute for this task.
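As a rough illustration of two ideas named in the abstract, the sketch below computes the Euclidean (Frobenius) distance between matrix representations of 2D direct similarities, and the mean that minimizes the sum of squared distances. It is a minimal sketch under stated assumptions, not the article's implementation: the helper names (`similarity_matrix`, `euclidean_distance`, `sum_squared`), the 2D setting, and the sample values are all hypothetical, and the minimization is taken over all 3x3 matrices rather than restricted to direct similarities.

```python
import numpy as np

def similarity_matrix(scale, angle, translation):
    """Homogeneous 3x3 matrix of the 2D direct similarity x -> scale * R(angle) x + t."""
    c, s = np.cos(angle), np.sin(angle)
    m = np.eye(3)
    m[:2, :2] = scale * np.array([[c, -s], [s, c]])
    m[:2, 2] = translation
    return m

def euclidean_distance(a, b):
    """Frobenius norm of the difference between two matrix representations."""
    return np.linalg.norm(a - b, ord="fro")

def sum_squared(candidate, samples):
    """Sum of squared Euclidean distances from a candidate to every sample."""
    return sum(euclidean_distance(candidate, x) ** 2 for x in samples)

# Hypothetical sample transformations (scale, rotation angle, translation).
samples = [
    similarity_matrix(1.0, 0.0, [0.0, 0.0]),
    similarity_matrix(2.0, np.pi / 2, [1.0, 0.0]),
    similarity_matrix(0.5, np.pi, [0.0, 2.0]),
]

# Over all 3x3 matrices, the entry-wise arithmetic mean minimizes the
# sum of squared Frobenius distances to the samples.
mean = np.mean(samples, axis=0)
cost_at_mean = sum_squared(mean, samples)
costs_at_samples = [sum_squared(x, samples) for x in samples]

# Left-invariance check: apply the same similarity Z on the left of both inputs.
# Z is a pure scaling by 2, so the Euclidean distance doubles instead of
# staying constant.
Z = similarity_matrix(2.0, 0.0, [0.0, 0.0])
X, Y = samples[0], samples[1]
d_before = euclidean_distance(X, Y)
d_after = euclidean_distance(Z @ X, Z @ Y)

print("cost at mean:", cost_at_mean)
print("costs at samples:", costs_at_samples)
print("distance before / after left-multiplication:", d_before, d_after)
```

The last check shows that this particular distance changes when both inputs are rescaled by a common similarity, which is the kind of property a left-invariant divergence is designed to avoid.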


Direct similarity, Distance, Mean, Registration, Object recognition



We are thankful to Peter Meer, Department of Electrical and Computer Engineering, Rutgers University, for his valuable feedback on the analysis of Riemannian distances in the article, and to all three reviewers for their insightful comments and suggestions, which helped us to improve the article.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Minh-Tri Pham (1)
  • Oliver J. Woodford (1)
  • Frank Perbet (1)
  • Atsuto Maki (2)
  • Riccardo Gherardi (1)
  • Björn Stenger (1)
  • Roberto Cipolla (3)

  1. Toshiba Research Europe Ltd., Cambridge, UK
  2. Computer Vision and Active Perception Laboratory, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
  3. Engineering Department, University of Cambridge, Cambridge, UK
