
Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12351)

Abstract

Conventional unsupervised domain adaptation (UDA) studies knowledge transfer between a limited number of domains, neglecting the more practical scenario in which real-world data are distributed across many different domains. A technique for measuring domain similarity is therefore critical to domain adaptation performance. To describe and learn the relations between different domains, we propose a novel Domain2Vec model that provides vectorial representations of visual domains by jointly learning feature disentanglement and Gram-matrix statistics. To evaluate the effectiveness of Domain2Vec, we create two large-scale cross-domain benchmarks: TinyDA, which contains 54 domains and about one million MNIST-style images, and DomainBank, which is collected from 56 existing vision datasets. We demonstrate that our embedding predicts domain similarities that match our intuition about visual relations between different domains. Extensive experiments demonstrate the power of the new datasets for benchmarking state-of-the-art multi-source domain adaptation methods, as well as the advantage of the proposed model. Data and code are available at https://github.com/VisionLearningGroup/Domain2Vec.
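The abstract does not specify the model's architecture, so as a rough illustration of the Gram-matrix idea only, the sketch below builds a simple domain vector by averaging flattened Gram matrices of CNN features over a domain's images and compares two domains with cosine similarity. The backbone, layer choice, and pooling are placeholder assumptions, not the authors' method.

```python
# Hypothetical sketch of a Gram-matrix-based domain vector (illustrative, not the paper's model).
import torch
import torch.nn.functional as F
from torchvision import models


def gram_domain_vector(loader, device="cpu"):
    """Average the flattened Gram matrices of conv features over one domain's images."""
    backbone = models.resnet18().to(device).eval()
    features = torch.nn.Sequential(*list(backbone.children())[:-2])  # keep conv feature maps
    vecs = []
    with torch.no_grad():
        for images, _ in loader:                                  # images: (B, 3, H, W)
            f = features(images.to(device))                       # (B, C, H', W')
            b, c, h, w = f.shape
            f = f.view(b, c, h * w)
            gram = torch.bmm(f, f.transpose(1, 2)) / (c * h * w)  # (B, C, C) per-image Gram
            vecs.append(gram.flatten(1).mean(0))                  # average over the batch
    return torch.stack(vecs).mean(0)                              # one vector for the domain


def domain_similarity(vec_a, vec_b):
    """Cosine similarity between two domain vectors."""
    return F.cosine_similarity(vec_a.unsqueeze(0), vec_b.unsqueeze(0)).item()
```

Given two PyTorch data loaders, one per domain, `domain_similarity(gram_domain_vector(loader_a), gram_domain_vector(loader_b))` yields a scalar that can be compared against intuitive visual relations between domains, in the spirit of the evaluation described above.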

Keywords

Unsupervised domain adaptation · Domain vectorization

Notes

Acknowledgements

We thank the anonymous reviewers for their comments and suggestions. This work was partially supported by NSF and Honda Research Institute.

Supplementary material

504443_1_En_45_MOESM1_ESM.pdf (4.6 MB)
Supplementary material 1 (PDF 4,741 KB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Boston University, Boston, USA
  2. Stanford University, Stanford, USA
  3. MIT-IBM Watson AI Lab, Boston, USA
