Eigendecomposition-Free Training of Deep Networks with Zero Eigenvalue-Based Losses

  • Zheng Dang
  • Kwang Moo Yi
  • Yinlin Hu
  • Fei Wang
  • Pascal Fua
  • Mathieu Salzmann
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11209)

Abstract

Many classical Computer Vision problems, such as essential matrix computation and pose estimation from 3D to 2D correspondences, can be solved by finding the eigenvector corresponding to the smallest, or zero, eigenvalue of a matrix representing a linear system. Incorporating this in deep learning frameworks would allow us to explicitly encode known notions of geometry, instead of having the network implicitly learn them from data. However, performing eigendecomposition within a network requires the ability to differentiate this operation. While theoretically doable, this introduces numerical instability in the optimization process in practice.
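As a rough illustration of the classical step described above, the following sketch recovers the solution of a homogeneous linear system A x = 0 as the eigenvector of A^T A with the smallest eigenvalue. The matrix sizes, the random test data, and the helper name `null_vector` are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

def null_vector(A):
    """Return the smallest eigenvalue of A^T A and its unit eigenvector."""
    # For a symmetric matrix, eigh returns eigenvalues in ascending order,
    # so column 0 of the eigenvector matrix belongs to the smallest one.
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)
    return eigvals[0], eigvecs[:, 0]

rng = np.random.default_rng(0)

# Hypothetical ground-truth solution (unit norm).
x_true = rng.normal(size=4)
x_true /= np.linalg.norm(x_true)

# Build a system whose rows are all orthogonal to x_true, so A @ x_true = 0.
B = rng.normal(size=(8, 4))
A = B - np.outer(B @ x_true, x_true)

lam, x_est = null_vector(A)

# The eigenvector is only defined up to sign, so compare via |cosine|.
print("smallest eigenvalue:", lam)
print("|cos angle| to ground truth:", abs(x_est @ x_true))
```

The recovered vector matches the ground truth only up to sign, which is one reason losses defined directly on such eigenvectors need care.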

In this paper, we introduce an eigendecomposition-free approach to training a deep network whose loss depends on the eigenvector corresponding to a zero eigenvalue of a matrix predicted by the network. We demonstrate on several tasks, including keypoint matching and 3D pose estimation, that our approach is much more robust than explicit differentiation of the eigendecomposition. It has better convergence properties and yields state-of-the-art results on both tasks.
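To make the idea of an eigendecomposition-free objective concrete, here is a minimal sketch of one possible ingredient: given a known target vector e, penalize the residual ||A e||^2 = e^T A^T A e, which is zero exactly when e lies in the null space of A. This is an assumption about what such a loss could look like, not the authors' exact formulation; the function names and test data are hypothetical. Note that this term alone is also minimized by A = 0, so a practical loss would need an additional term to rule out that trivial solution.

```python
import numpy as np

def residual_loss(A, e):
    """Squared residual ||A e||^2 = e^T A^T A e (no eigendecomposition needed)."""
    r = A @ e
    return float(r @ r)

def residual_loss_grad(A, e):
    """Analytic gradient of ||A e||^2 with respect to A: 2 (A e) e^T."""
    return 2.0 * np.outer(A @ e, e)

rng = np.random.default_rng(0)

# Hypothetical "network output" A and unit-norm target eigenvector e.
A = rng.normal(size=(6, 4))
e = rng.normal(size=4)
e /= np.linalg.norm(e)

# One plain gradient-descent step on A.
step = 0.1
A_new = A - step * residual_loss_grad(A, e)

print("loss before:", residual_loss(A, e))
print("loss after: ", residual_loss(A_new, e))
```

Because the gradient is a simple outer product, it stays well behaved regardless of how close the eigenvalues of A^T A are to each other, in contrast to gradients propagated through an explicit eigendecomposition.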

Keywords

End-to-end learning, Eigendecomposition, Singular value decomposition, Geometric vision

Notes

Acknowledgements

This research was partially supported by the National Natural Science Foundation of China: Grant 61603291, the program for introducing talents of discipline to university B13043 and the National Science, Technology Major Project: 2018ZX01008103, and by a grant from the Swiss Innovation Agency (CTI/InnoSuisse). This work was performed while Zheng Dang was visiting the CVLab at EPFL.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. National Engineering Laboratory for Visual Information Processing and Application, Xi’an Jiaotong University, Xi’an, China
  2. School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
  3. Visual Computing Group, University of Victoria, Victoria, Canada
  4. CVLab, EPFL, Lausanne, Switzerland
