Label Propagation with Augmented Anchors: A Simple Semi-supervised Learning Baseline for Unsupervised Domain Adaptation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12349)

Abstract

Motivated by the problem relatedness between unsupervised domain adaptation (UDA) and semi-supervised learning (SSL), many state-of-the-art UDA methods adopt SSL principles (e.g., the cluster assumption) as their learning ingredients. However, they tend to overlook the very domain-shift nature of UDA. In this work, we take a step further to study the proper extensions of SSL techniques for UDA. Taking the algorithm of label propagation (LP) as an example, we analyze the challenges of adopting LP to UDA and theoretically analyze the conditions of affinity graph/matrix construction in order to achieve better propagation of true labels to unlabeled instances. Our analysis suggests a new algorithm of Label Propagation with Augmented Anchors (A\(^2\)LP), which could potentially improve LP via generation of unlabeled virtual instances (i.e., the augmented anchors) with high-confidence label predictions. To make the proposed A\(^2\)LP useful for UDA, we propose empirical schemes to generate such virtual instances. The proposed schemes also tackle the domain-shift challenge of UDA by alternating between pseudo labeling via A\(^2\)LP and domain-invariant feature learning. Experiments show that such a simple SSL extension improves over representative UDA methods of domain-invariant feature learning, and could empower two state-of-the-art methods on benchmark UDA datasets. Our results show the value of further investigation on SSL techniques for UDA problems.
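
To make the label propagation (LP) step mentioned above concrete, here is a minimal sketch of classic closed-form LP on an affinity graph. It is not the authors' A\(^2\)LP: the Gaussian affinity, the symmetric normalization, the parameters `alpha` and `sigma`, the function name `label_propagation`, and the toy data are all assumptions chosen for illustration. The abstract does not specify how the augmented anchors are generated, so none are constructed here.

```python
import numpy as np


def label_propagation(X, y, alpha=0.75, sigma=1.0):
    """Classic closed-form label propagation (illustrative sketch only).

    X: (n, d) feature array. y: (n,) integer class ids, -1 for unlabeled.
    alpha and sigma are assumed hyperparameters, not values from the paper.
    """
    n = X.shape[0]
    num_classes = int(y.max()) + 1

    # Gaussian affinity between all pairs, with self-affinity set to zero.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # One-hot label matrix; rows of unlabeled instances stay all-zero.
    Y = np.zeros((n, num_classes))
    labeled = y >= 0
    Y[labeled, y[labeled]] = 1.0

    # Closed-form propagation: solve (I - alpha * S) F = Y.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(axis=1)


# Toy data (hypothetical): two well-separated clusters, one label per cluster.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
y = np.array([0, -1, -1, 1, -1, -1])

preds = label_propagation(X, y)
print(preds)  # on this toy data: [0 0 0 1 1 1]
```

In a UDA setting, the labeled rows would come from the source domain and the unlabeled rows from the target domain. Under that reading, the quality of the affinity matrix `W` across the domain shift is what decides whether true labels propagate correctly.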

Keywords

Domain adaptation · Semi-supervised learning · Label propagation

Notes

Acknowledgment

This work is supported in part by the Guangdong R&D key project of China (Grant No.: 2019B010155001), the National Natural Science Foundation of China (Grant No.: 61771201), and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (Grant No.: 2017ZT07X183).

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. South China University of Technology, Guangzhou, China
  2. Pazhou Lab, Guangzhou, China
  3. DAMO Academy, Alibaba Group, Hangzhou, China
  4. Department of Computing, The Hong Kong Polytechnic University, Hong Kong, Hong Kong