
One-Shot Unsupervised Cross-Domain Detection

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12361)

Abstract

Despite impressive progress in object detection in recent years, reliably detecting objects across visual domains remains an open challenge. All current approaches access a sizable amount of target data at training time. This is a strong assumption, as it is often impossible to anticipate the domain where a detector will be used, or to access it in advance for data acquisition. Consider, for instance, the task of monitoring image feeds from social media: since every image is uploaded by a different user, it belongs to a different target domain that is impossible to foresee during training. Our work addresses this setting, presenting an object detection algorithm able to perform unsupervised adaptation across domains using only one target sample, seen at test time. We introduce a multi-task architecture that adapts in one shot to any incoming sample by iteratively solving a self-supervised task on it. We further enhance this auxiliary adaptation with cross-task pseudo-labeling. A thorough benchmark analysis against the most recent cross-domain detection methods and a detailed ablation study show the advantage of our approach.
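
The adaptation loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `one_shot_adapt`, `toy_aux_loss`, and `toy_aux_grad` are hypothetical, and a toy quadratic loss stands in for a real self-supervised task (for example, image-rotation prediction). The sketch only shows the control flow: start from the pretrained parameters, take a few gradient steps on the auxiliary loss of a single unlabeled target sample, and leave the pretrained parameters untouched so the next sample starts from the same point.

```python
from typing import Callable, List, Sequence

Params = List[float]


def one_shot_adapt(
    pretrained: Sequence[float],
    sample: Sequence[float],
    aux_grad: Callable[[Params, Sequence[float]], Params],
    steps: int = 30,
    lr: float = 0.1,
) -> Params:
    """Return a copy of `pretrained` fine-tuned on one unlabeled sample."""
    # Work on a copy so the pretrained parameters can be reused for the next sample.
    params = list(pretrained)
    for _ in range(steps):
        # Gradient of the self-supervised (auxiliary) loss on this single sample.
        grads = aux_grad(params, sample)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


# Hypothetical toy auxiliary task (an assumption for illustration only):
# the loss is the squared distance between the parameters and the sample.
def toy_aux_loss(params: Sequence[float], sample: Sequence[float]) -> float:
    return sum((p - x) ** 2 for p, x in zip(params, sample))


def toy_aux_grad(params: Sequence[float], sample: Sequence[float]) -> Params:
    return [2.0 * (p - x) for p, x in zip(params, sample)]


pretrained = [0.0, 0.0]
target_sample = [1.0, -2.0]
adapted = one_shot_adapt(pretrained, target_sample, toy_aux_grad)
```

In a real detector, the adapted parameters would be the shared backbone, the final prediction would come from the detection head run on the adapted features, and the number of steps and the learning rate would be tuned; the values used here are arbitrary.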

Keywords

Object detection · Cross-domain analysis · Self-supervision

Notes

Acknowledgements

This work was partially funded by the ERC grant 637076 RoboExNovo (AD, FCB, SB, BC) and took advantage of the GPU donated by NVIDIA (Academic Hardware Grant, TT). We thank Tomer Cohen and Kim Taekyung for their support with the code of BiOST and DivMatch, respectively.

Supplementary material

504471_1_En_43_MOESM1_ESM.pdf (791 kb)
Supplementary material 1 (pdf 790 KB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Sapienza University of Rome, Rome, Italy
  2. Politecnico di Torino, Turin, Italy
  3. Italian Institute of Technology, Turin, Italy
