Minimum Class Confusion for Versatile Domain Adaptation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12366)


There are a variety of Domain Adaptation (DA) scenarios, depending on label sets and domain configurations, including closed-set and partial-set DA, as well as multi-source and multi-target DA. Notably, existing DA methods are generally designed for one specific scenario and may underperform in scenarios they are not tailored to. To this end, this paper studies Versatile Domain Adaptation (VDA), where a single method can handle several different DA scenarios without any modification. Towards this goal, a more general inductive bias than domain alignment should be explored. We delve into a missing piece of existing methods: class confusion, the tendency of a classifier to confuse its predictions between the correct class and ambiguous classes for target examples, which is common across DA scenarios. We find that reducing such pairwise class confusion leads to significant transfer gains. With this insight, we propose a general loss function: Minimum Class Confusion (MCC). It can be characterized as (1) a non-adversarial DA method that does not explicitly deploy domain alignment and enjoys faster convergence; (2) a versatile approach that can handle four existing scenarios, Closed-Set, Partial-Set, Multi-Source, and Multi-Target DA, outperforming the state-of-the-art methods in these scenarios, especially on one of the largest and hardest datasets to date (a \(7.3\%\) gain on DomainNet). Its versatility is further justified by two scenarios proposed in this paper: Multi-Source Partial DA and Multi-Target Partial DA. In addition, MCC can be used as a general regularizer that is orthogonal and complementary to a variety of existing DA methods, accelerating convergence and pushing these readily competitive methods to stronger ones. Code is available at
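The core idea of the abstract, penalizing pairwise class confusion on target predictions, can be sketched concretely. The snippet below is a minimal, illustrative rendition (not the paper's exact formulation, which includes further components such as example reweighting): it forms a class-correlation matrix from temperature-scaled softmax probabilities of a batch and penalizes its off-diagonal, i.e. between-class, mass. The function name and the `temperature` default are hypothetical choices for illustration.

```python
import numpy as np

def minimum_class_confusion(logits, temperature=2.5):
    """Sketch of a class-confusion loss on a batch of target logits.

    logits: (B, K) array of classifier outputs for B target examples.
    Returns a scalar measuring between-class (off-diagonal) confusion.
    """
    # Temperature-scaled softmax: softer probabilities expose ambiguity.
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)     # numerical stability
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)        # (B, K) class probabilities

    # Pairwise class confusion: entry [j, j'] accumulates how much the
    # batch assigns probability to classes j and j' simultaneously.
    confusion = p.T @ p                      # (K, K)

    # Normalize each row so the matrix rows sum to one.
    confusion /= confusion.sum(axis=1, keepdims=True)

    # Keep only between-class confusion: total mass minus the diagonal.
    k = confusion.shape[0]
    return (confusion.sum() - np.trace(confusion)) / k
```

Confident, well-separated predictions yield a near-diagonal confusion matrix and a loss near zero, whereas maximally ambiguous (uniform) predictions yield the maximum value \((K-1)/K\), so minimizing this quantity pushes the classifier away from ambiguous target predictions without any explicit domain alignment.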


Keywords: Versatile domain adaptation · Minimum class confusion



The work was supported by the Natural Science Foundation of China (61772299, 71690231), and China University S&T Innovation Plan by Ministry of Education.

Supplementary material (1.5 mb)
Supplementary material 1 (zip 1566 KB)



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. School of Software, BNRist, Research Center for Big Data, Tsinghua University, Beijing, China
