Abstract
Although self-supervised pre-training of transformer models has revolutionized natural language processing (NLP) applications and achieved state-of-the-art results on various benchmarks, these models remain vulnerable to small, imperceptible perturbations of legitimate inputs. Intuitively, representations should stay close in the feature space under subtle input perturbations and should differ substantially only when the meaning changes. This motivates us to learn robust textual representations in a contrastive manner. However, obtaining semantically opposing instances for textual samples is non-trivial. In this study, we propose a disentangled contrastive learning method that separately optimizes the uniformity and alignment of representations without negative sampling. Specifically, we introduce the notion of momentum representation consistency to align features, and we leverage power normalization to preserve uniformity. Our experimental results on NLP benchmarks demonstrate that our approach outperforms the baselines and yields promising improvements under invariance tests and adversarial attacks. The code is available at https://github.com/zxlzr/DCL.
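The abstract does not spell out the implementation, but the alignment term it describes (matching an online encoder to a momentum-updated target encoder, with no negative samples) can be illustrated with a minimal PyTorch sketch. Everything below (the toy `OnlineEncoder`, the EMA coefficient, the noise-based "views") is our own illustrative assumption, not the authors' released code.

```python
# Hedged sketch: momentum-consistency alignment without negative samples.
# The encoder, hyperparameters, and view construction are illustrative assumptions.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class OnlineEncoder(nn.Module):
    """Toy sentence encoder standing in for a pre-trained transformer."""
    def __init__(self, dim_in=768, dim_out=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim_in, 512), nn.ReLU(), nn.Linear(512, dim_out))

    def forward(self, x):
        return self.net(x)

def update_momentum_encoder(online, target, m=0.999):
    # Exponential moving average of the online weights keeps the target encoder consistent.
    with torch.no_grad():
        for p_t, p_o in zip(target.parameters(), online.parameters()):
            p_t.data.mul_(m).add_(p_o.data, alpha=1.0 - m)

def alignment_loss(online, target, view1, view2):
    # Align the online representation of one view with the momentum (target)
    # representation of the other view; no negative samples are involved.
    z1 = F.normalize(online(view1), dim=-1)
    with torch.no_grad():
        z2 = F.normalize(target(view2), dim=-1)
    return (2 - 2 * (z1 * z2).sum(dim=-1)).mean()

if __name__ == "__main__":
    online = OnlineEncoder()
    target = copy.deepcopy(online)
    for p in target.parameters():
        p.requires_grad_(False)
    # Two "views" of the same batch of sentence features (e.g., lightly perturbed encodings).
    x = torch.randn(8, 768)
    view1, view2 = x + 0.01 * torch.randn_like(x), x + 0.01 * torch.randn_like(x)
    loss = alignment_loss(online, target, view1, view2)
    loss.backward()
    update_momentum_encoder(online, target)
    print(f"alignment loss: {loss.item():.4f}")
```

In the paper's formulation the uniformity of the representation space is handled separately (via power normalization) rather than by negative pairs, which is why the loss above contains only an alignment term.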
X. Chen and X. Xie—Equal contribution and shared co-first authorship.
Acknowledgments
We would like to thank the anonymous reviewers for their hard work and kind comments. This work is funded by NSFC grants 91846204 and U19B2027.
Cite this paper
Chen, X., et al. (2021). Disentangled Contrastive Learning for Learning Robust Textual Representations. In: Fang, L., Chen, Y., Zhai, G., Wang, J., Wang, R., Dong, W. (eds.) Artificial Intelligence. CICAI 2021. Lecture Notes in Computer Science, vol. 13070. Springer, Cham. https://doi.org/10.1007/978-3-030-93049-3_18