Transfer channel pruning for compressing deep domain adaptation models

Abstract

Deep unsupervised domain adaptation (UDA) has recently received increasing attention from researchers. However, existing methods are computationally intensive because of the convolutional neural networks (CNNs) most of them adopt, and there is no effective network compression method tailored to this problem. In this paper, we propose a unified transfer channel pruning (TCP) method for accelerating deep UDA models. TCP compresses a deep UDA model by pruning less important channels while simultaneously learning transferable features by reducing the cross-domain distribution divergence; it therefore reduces the impact of negative transfer while maintaining competitive performance on the target task. To the best of our knowledge, TCP is the first approach that aims at accelerating deep unsupervised domain adaptation models. TCP is validated on the two main kinds of UDA methods, discrepancy-based and adversarial-based, and on two benchmark datasets, Office-31 and ImageCLEF-DA, with two common backbone networks, VGG16 and ResNet50. Experimental results demonstrate that TCP achieves comparable or better classification accuracy than the comparison methods while significantly reducing the computational cost. More specifically, with VGG16 we obtain even higher accuracy after pruning 26% of the floating-point operations (FLOPs); with ResNet50 we also obtain higher accuracy on half of the tasks after pruning 12% of the FLOPs, for both discrepancy-based and adversarial-based methods.
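
The abstract does not spell out which divergence measure or pruning criterion is used, so the following Python sketch is only an illustration of how such an objective could be assembled: a Gaussian-kernel MMD term as the cross-domain divergence and a first-order Taylor score as the channel-importance criterion are assumptions made for this example, not the authors' exact formulation.

    import torch

    def gaussian_mmd(src, tgt, bandwidth=1.0):
        # Squared maximum mean discrepancy between two feature batches of
        # shape (n, d), using a single RBF kernel (an illustrative choice).
        def rbf(a, b):
            return torch.exp(-torch.cdist(a, b) ** 2 / (2 * bandwidth ** 2))
        return rbf(src, src).mean() + rbf(tgt, tgt).mean() - 2 * rbf(src, tgt).mean()

    def channel_importance(act, grad):
        # First-order Taylor criterion: average |activation * gradient| per
        # channel for a feature map of shape (batch, channels, h, w).
        # Channels with the smallest scores are the pruning candidates.
        return (act * grad).mean(dim=(0, 2, 3)).abs()

    def transfer_finetune_loss(cls_loss_on_source, src_feat, tgt_feat, trade_off=1.0):
        # Supervised loss on the labelled source data plus an MMD penalty on
        # the unlabelled target features, so that importance scores and
        # pruning decisions are computed on transferable features.
        return cls_loss_on_source + trade_off * gaussian_mmd(src_feat, tgt_feat)

In an iterative scheme of this kind, the lowest-scoring channels would be removed after fine-tuning with the combined loss, and the pruning/fine-tuning cycle repeated until the desired FLOPs reduction (such as the 26% or 12% quoted above) is reached.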

Notes

  1. http://imageclef.org/2014/adaptation

Acknowledgements

This work is supported in part by the National Key Research & Development Plan of China (No. 2017YFB1002802), the NSFC (No. 61572471), and the Beijing Municipal Science & Technology Commission (No. Z171100000117017).

Author information

Corresponding author

Correspondence to Yiqiang Chen.

Cite this article

Yu, C., Wang, J., Chen, Y. et al. Transfer channel pruning for compressing deep domain adaptation models. Int. J. Mach. Learn. & Cyber. 10, 3129–3144 (2019). https://doi.org/10.1007/s13042-019-01004-6

Keywords

  • Unsupervised domain adaptation
  • Transfer channel pruning
  • Accelerating