Abstract
Training a model from scratch in a data-deficient environment is a challenging task. In this challenge, we train multiple distinct backbones and use a number of tricks to assist training, such as weight initialization, mixup, and cutmix. Finally, we propose a three-stage model fusion method to improve accuracy. Our final Top-1 accuracy on the public test set is 84.62421%.
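The abstract names mixup as one of the training tricks but gives no implementation details. As a minimal sketch of the standard mixup formulation (not necessarily the configuration used by the authors), each sample is blended with a randomly chosen partner from the same batch using a coefficient drawn from a Beta distribution. The function name `mixup_batch`, the NumPy implementation, and the value `alpha=0.2` are assumptions made for illustration.

```python
import numpy as np


def mixup_batch(images, labels_onehot, alpha=0.2, rng=None):
    """Blend every sample with a random partner from the same batch.

    Hypothetical helper for illustration; alpha=0.2 is an assumed value,
    not one reported in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One mixing coefficient per batch, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)
    # Random permutation picks the partner sample for each position.
    perm = rng.permutation(images.shape[0])
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * labels_onehot + (1.0 - lam) * labels_onehot[perm]
    return mixed_images, mixed_labels, lam


if __name__ == "__main__":
    demo_rng = np.random.default_rng(0)
    batch = demo_rng.random((4, 3, 8, 8))   # 4 images, 3 channels, 8x8 pixels
    targets = np.eye(5)[[0, 1, 2, 3]]       # one-hot labels over 5 classes
    mixed_x, mixed_y, lam = mixup_batch(batch, targets, rng=demo_rng)
    print(mixed_x.shape, mixed_y.shape)
```

Because the labels are mixed with the same coefficient as the images, each mixed label row still sums to one and can be used directly with a soft-target cross-entropy loss.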
Acknowledgement
Throughout the writing of this paper I have received a great deal of support and assistance. I would first like to thank my tutors for their valuable guidance throughout my studies; they provided me with the tools I needed to choose the right direction and successfully complete the competition. I would particularly like to acknowledge my teammate for their wonderful collaboration and patient support. Finally, I would not have been able to take part in this competition without the support of the organizer, NICO, who provided a good competition environment and reasonable advice on the competition.
This work was supported by the National Natural Science Foundation of China (No. 62076192), the Key Research and Development Program in Shaanxi Province of China (No. 2019ZDLGY03-06), the State Key Program of National Natural Science of China (No. 61836009), the Program for Cheung Kong Scholars and Innovative Research Team in University (No. IRT_15R53), the Fund for Foreign Scholars in University Research and Teaching Programs (the 111 Project) (No. B07048), the Key Scientific Technological Innovation Research Project by Ministry of Education, the National Key Research and Development Program of China, and the CAAI Huawei MindSpore Open Fund.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, J. et al. (2023). A Three-Stage Model Fusion Method for Out-of-Distribution Generalization. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_33
DOI: https://doi.org/10.1007/978-3-031-25075-0_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25074-3
Online ISBN: 978-3-031-25075-0
eBook Packages: Computer Science, Computer Science (R0)