
Global–local Bi-alignment for purer unsupervised domain adaptation

Published in: The Journal of Supercomputing

Abstract

Unsupervised domain adaptation (UDA) aims to extract domain-invariant features. Existing UDA methods mainly use a convolutional neural network (CNN) or a vision transformer (ViT) as the feature extractor and align distributions in a latent space that characterizes only a single view of the object, which can lead to matching errors: distributions aligned in the CNN space may still be confused in the ViT space. To address this, we introduce global–local bi-alignment (GLBA), built on the hybrid Conformer backbone, which enforces simultaneous alignment in both spaces under the space-independence assumption: if two domains have the same distribution, their distributions are aligned in any latent space. The framework can easily be combined with previous UDA methods, essentially adding only an alignment loss, without elaborate structures or a large number of extra parameters. Experiments demonstrate the effectiveness of GLBA, which achieves state-of-the-art (SoTA) performance with comparable parameter complexity. Code is available at https://github.com/JSJ515-Group/GLBA.
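The bi-alignment idea can be illustrated with a minimal sketch: instead of matching source and target distributions in a single latent space, an alignment loss is computed in both the local (CNN) and global (ViT) feature spaces and summed. This sketch uses a linear-kernel maximum mean discrepancy (MMD), a standard UDA alignment criterion, purely for illustration; the names `mmd_linear`, `bi_alignment_loss`, and the weight `lam` are assumptions of this sketch, not the paper's actual loss, which may be adversarial or use a different discrepancy.

```python
import numpy as np

def mmd_linear(xs, xt):
    # Linear-kernel MMD (illustrative): squared distance between the
    # feature means of the source batch xs and target batch xt.
    return float(np.sum((xs.mean(axis=0) - xt.mean(axis=0)) ** 2))

def bi_alignment_loss(src_cnn, tgt_cnn, src_vit, tgt_vit, lam=1.0):
    # GLBA-style objective (sketch): enforce alignment simultaneously
    # in the local (CNN) and global (ViT) latent spaces, so that
    # distributions matched in one space are not left confused in the other.
    return mmd_linear(src_cnn, tgt_cnn) + lam * mmd_linear(src_vit, tgt_vit)

# Toy features: the target domain is mean-shifted relative to the source.
rng = np.random.default_rng(0)
src_cnn = rng.normal(0.0, 1.0, (64, 128))
tgt_cnn = rng.normal(0.5, 1.0, (64, 128))
src_vit = rng.normal(0.0, 1.0, (64, 256))
tgt_vit = rng.normal(0.5, 1.0, (64, 256))

loss = bi_alignment_loss(src_cnn, tgt_cnn, src_vit, tgt_vit)
```

During training, minimizing such a combined loss (alongside the usual classification loss on source labels) pushes the extractor toward features whose source and target distributions agree in both spaces, matching the space-independence assumption stated in the abstract.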


Data availability

The four public, open-source datasets used in this work are directly available through their original publications.


Acknowledgements

This research was supported by the Research Foundation of the Institute of Environment-friendly Materials and Occupational Health (Wuhu) Anhui University of Science and Technology (No. ALW2021YF04) and the Medical Special Cultivation Project of Anhui University of Science and Technology (No. YZ2023H2C005).

Author information


Corresponding author

Correspondence to Xingzhu Liang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lin, Ye., Liu, E., Liang, X. et al. Global–local Bi-alignment for purer unsupervised domain adaptation. J Supercomput (2024). https://doi.org/10.1007/s11227-024-06038-4

