Online Invariance Selection for Local Feature Descriptors

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12347)

Abstract

To be invariant, or not to be invariant: that is the question formulated in this work about local descriptors. A limitation of current feature descriptors is the trade-off between generalization and discriminative power: more invariance means less informative descriptors. We propose to overcome this limitation by disentangling invariance in local descriptors and by selecting online the most appropriate invariance given the context. Our framework (https://github.com/rpautrat/LISRD) jointly learns multiple local descriptors with different levels of invariance, together with meta descriptors that encode the regional variations of an image. The similarity of these meta descriptors across images is used to select the right invariance when matching the local descriptors. Our approach, named Local Invariance Selection at Runtime for Descriptors (LISRD), enables descriptors to adapt to adverse changes in images while remaining discriminative when invariance is not required. We demonstrate that our method can boost the performance of current descriptors and that it outperforms state-of-the-art descriptors in several matching tasks when evaluated on challenging datasets with day-night illumination changes as well as viewpoint changes.
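The selection step described above can be illustrated with a minimal sketch. It is not the released LISRD implementation: the softmax weighting, the function names (`select_and_match`, `softmax`, `dot`) and the variant labels are assumptions made for illustration, since the abstract only states that meta descriptor similarity is used to select the invariance. Each keypoint is assumed to carry one L2-normalized local descriptor and one L2-normalized meta descriptor per invariance variant.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors (cosine similarity if both are L2-normalized)."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_and_match(local_a, local_b, meta_a, meta_b):
    """Score a candidate match between keypoint a and keypoint b.

    local_a, local_b: dict mapping a variant name to a local descriptor.
    meta_a, meta_b:   dict mapping the same variant names to meta descriptors.
    The softmax weighting is an assumption of this sketch, not a detail
    taken from the abstract.
    Returns (weighted similarity, dict of per-variant weights).
    """
    variants = sorted(local_a)
    # Meta descriptors that agree across the two images suggest that the
    # corresponding variant is appropriate for this image pair.
    meta_sims = [dot(meta_a[v], meta_b[v]) for v in variants]
    weights = softmax(meta_sims)
    # Local descriptor similarities, one per variant.
    local_sims = [dot(local_a[v], local_b[v]) for v in variants]
    score = sum(w * s for w, s in zip(weights, local_sims))
    return score, dict(zip(variants, weights))
```

With two hypothetical variants, `"invariant"` and `"variant"`, a pair whose meta descriptors agree only for the invariant variant gives that variant the larger weight, so its local descriptor similarity dominates the final score. A soft weighting keeps every variant in play. A hard arg-max over the meta similarities would be an equally plausible reading of the abstract.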

Keywords

Local descriptors, Invariance, Visual localization

Notes

Acknowledgments

This work has been supported by an ETH Zurich Postdoctoral Fellowship and Innosuisse funding (Grant No. 34475.1 IP-ICT).

Supplementary material

Supplementary material 1: 504434_1_En_42_MOESM1_ESM.zip (ZIP, 26.3 MB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science, ETH Zurich, Zurich, Switzerland
  2. Microsoft MR & AI, Zurich, Switzerland
