Abstract
In this paper, we study unsupervised 2D image-based 3D shape retrieval (UIBSR), which aims to retrieve unlabeled 3D shapes (the target domain) using labeled 2D images (the source domain). Previous work on UIBSR mainly reduces the cross-domain discrepancy by aligning prototypes generated from the source labels with those generated from predicted target pseudo-labels. However, simply enforcing consistency between features may corrupt the original semantic information. Moreover, existing methods usually ignore the diversity of instances during adaptation, which reduces the discriminability of the learned features. To address these problems, we propose prototype-based semantic consistency (PSC) learning, which exploits semantic knowledge in both prototype-prototype and prototype-instance relationships in the probability space rather than the embedding space, so as to preserve the structure of the semantic information. In addition, we propose a novel adversarial scheme between the feature extractor and the classifier that exploits the characteristics of different instances and helps the model learn more robust representations. Extensive experiments on two challenging datasets demonstrate the superiority of the proposed method.
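As a rough illustration of what prototype-prototype consistency in the probability space could look like, the following minimal sketch builds one prototype per class by averaging classifier softmax outputs (source samples grouped by their labels, target samples by their argmax pseudo-labels) and penalizes the divergence between matching prototypes. This is not the authors' implementation. The function names, the mean-of-softmax prototypes, and the symmetric KL divergence are all assumptions made for this example; the abstract does not specify the actual loss.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def class_prototypes(probs, labels, num_classes):
    """Hypothetical helper: mean probability vector per class.

    Classes with no assigned samples fall back to a uniform vector
    (an assumption made here so that every prototype stays a valid
    probability distribution).
    """
    protos = np.full((num_classes, probs.shape[1]), 1.0 / probs.shape[1])
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = probs[mask].mean(axis=0)
    return protos


def symmetric_kl(p, q, eps=1e-8):
    """Row-wise symmetric KL divergence between two sets of distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    kl_pq = np.sum(p * np.log(p / q), axis=1)
    kl_qp = np.sum(q * np.log(q / p), axis=1)
    return 0.5 * (kl_pq + kl_qp)


def prototype_consistency(src_logits, src_labels, tgt_logits, num_classes):
    """Assumed loss: mean divergence between same-class source/target prototypes."""
    src_probs = softmax(src_logits)
    tgt_probs = softmax(tgt_logits)
    # Target samples are unlabeled, so they are grouped by pseudo-label.
    tgt_pseudo = tgt_probs.argmax(axis=1)
    src_protos = class_prototypes(src_probs, src_labels, num_classes)
    tgt_protos = class_prototypes(tgt_probs, tgt_pseudo, num_classes)
    return float(symmetric_kl(src_protos, tgt_protos).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src_logits = rng.normal(size=(8, 3))   # stand-in for image classifier outputs
    src_labels = rng.integers(0, 3, size=8)
    tgt_logits = rng.normal(size=(10, 3))  # stand-in for shape classifier outputs
    print(prototype_consistency(src_logits, src_labels, tgt_logits, num_classes=3))
```

Because the prototypes here are probability vectors rather than embedding vectors, the comparison is between class-level predictive distributions, which is one plausible reading of operating "in the probability space rather than the embedding space".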
Acknowledgements
This work was supported in part by the National Key Research and Development Program of China (2020YFB1709201), the National Natural Science Foundation of China (U21B2024, U22A2068, 62202327), the China Postdoctoral Science Foundation (2022M712369), and the Baidu Pinecone Program.
Author information
Contributions
A-AL: Conceptualization, Methodology. YZ: Software, Writing - original draft preparation. CZ: Software, Writing - reviewing and editing. WL: Writing - reviewing and editing. BL: Supervision, Conceptualization. LL: Software, Visualization. XL: Validation, Investigation.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Liu, AA., Zhang, Y., Zhang, C. et al. Prototype-based semantic consistency learning for unsupervised 2D image-based 3D shape retrieval. Multimedia Systems 29, 1995–2007 (2023). https://doi.org/10.1007/s00530-023-01086-x
DOI: https://doi.org/10.1007/s00530-023-01086-x