Abstract
We address the problem of camera pose estimation in visual localization. Current regression-based methods for pose estimation are trained and evaluated scene-wise. They depend on the coordinate frame of the training dataset and generalize poorly across scenes and datasets. We identify dataset shift as an important barrier to generalization and consider transfer learning as an alternative way towards better reuse of pose estimation models. We revise domain adaptation techniques for classification and extend them to camera pose estimation, which is a multi-regression task. We develop a deep adaptation network for learning scene-invariant image representations and use adversarial learning to generate such representations for model transfer. We enrich the network with self-supervised learning and use adaptability theory to validate the existence of a scene-invariant representation of images in two given scenes. We evaluate our network on two public datasets, Cambridge Landmarks and 7Scene, demonstrate its superiority over several baselines, and compare it to state-of-the-art methods.
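To make the multi-regression target concrete, the sketch below shows how a 6-DoF pose prediction is commonly scored on such benchmarks: a Euclidean position error plus an angular orientation error between unit quaternions. It is an illustration only, not the authors' evaluation code. The function names, the (position, quaternion) tuple layout, and the (w, x, y, z) quaternion order are assumptions made for this example.

```python
import math


def normalize(q):
    """Return q scaled to unit length (hypothetical helper for this sketch)."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("zero-norm quaternion")
    return tuple(c / norm for c in q)


def position_error(t_pred, t_true):
    """Euclidean distance between predicted and true 3-D positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_pred, t_true)))


def orientation_error_deg(q_pred, q_true):
    """Angle in degrees between two orientations given as quaternions.

    Assumes (w, x, y, z) order. The absolute value of the dot product
    treats q and -q as the same rotation.
    """
    qp, qt = normalize(q_pred), normalize(q_true)
    dot = abs(sum(a * b for a, b in zip(qp, qt)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


def pose_error(pred, true):
    """Return (position error, orientation error in degrees) for one pose."""
    (t_pred, q_pred), (t_true, q_true) = pred, true
    return position_error(t_pred, t_true), orientation_error_deg(q_pred, q_true)
```

Both terms are kept separate here because position and orientation live in different units. How the two are weighted during training is a design choice the abstract does not specify.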
Notes
1. We show 3-DoF pose positions only; pose orientations show the same phenomenon.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Chidlovskii, B., Sadek, A. (2020). Adversarial Transfer of Pose Estimation Regression. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science, vol 12535. Springer, Cham. https://doi.org/10.1007/978-3-030-66415-2_43
DOI: https://doi.org/10.1007/978-3-030-66415-2_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66414-5
Online ISBN: 978-3-030-66415-2
eBook Packages: Computer Science, Computer Science (R0)