Disentangled Contrastive Learning for Learning Robust Textual Representations

  • Conference paper
  • In: Artificial Intelligence (CICAI 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13070)


Abstract

Although the self-supervised pre-training of transformer models has revolutionized natural language processing (NLP) applications and achieved state-of-the-art results on various benchmarks, these models remain vulnerable to small, imperceptible perturbations of legitimate inputs. Intuitively, representations should stay close in the feature space under subtle input perturbations and vary substantially only when the meaning changes. This motivates us to investigate learning robust textual representations in a contrastive manner. However, it is non-trivial to obtain opposing semantic instances for textual samples. In this study, we propose a disentangled contrastive learning method that separately optimizes the uniformity and alignment of representations without negative sampling. Specifically, we introduce momentum representation consistency to align features and leverage power normalization to enforce uniformity. Our experimental results on NLP benchmarks demonstrate that our approach outperforms the baselines and achieves promising improvements on invariance tests and under adversarial attacks. The code is available at https://github.com/zxlzr/DCL.
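
To make the two disentangled objectives concrete, the following is a minimal PyTorch sketch of the idea, not the authors' implementation (see the linked DCL repository for that): an alignment term that pulls a sample's representation toward the output of a momentum-updated encoder, with no negative samples, and a uniformity term that rescales features by batch statistics in the spirit of power normalization and then encourages them to spread over the unit hypersphere. The encoder names, the momentum coefficient, and the batch-statistics rescaling standing in for full PowerNorm are illustrative assumptions.

import torch
import torch.nn.functional as F

def update_momentum_encoder(encoder, momentum_encoder, m=0.999):
    # Exponential moving average update; the momentum branch receives no gradients.
    with torch.no_grad():
        for p, p_m in zip(encoder.parameters(), momentum_encoder.parameters()):
            p_m.data.mul_(m).add_(p.data, alpha=1.0 - m)

def alignment_loss(online_repr, momentum_repr):
    # Align each sample with its augmented view from the momentum encoder (no negatives).
    z1 = F.normalize(online_repr, dim=-1)
    z2 = F.normalize(momentum_repr, dim=-1)
    return (2.0 - 2.0 * (z1 * z2).sum(dim=-1)).mean()

def uniformity_loss(reprs, eps=1e-5):
    # Stand-in for the uniformity term: rescale by batch statistics (in the spirit of
    # PowerNorm), then reward pairwise spread on the unit hypersphere.
    reprs = reprs / (reprs.pow(2).mean(dim=0, keepdim=True).sqrt() + eps)
    z = F.normalize(reprs, dim=-1)
    sq_dists = torch.cdist(z, z).pow(2)
    return torch.log(torch.exp(-2.0 * sq_dists).mean())

# A typical training step under these assumptions:
#   z_online = encoder(view_1)
#   with torch.no_grad():
#       z_target = momentum_encoder(view_2)
#   loss = alignment_loss(z_online, z_target) + lam * uniformity_loss(z_online)
#   loss.backward(); optimizer.step()
#   update_momentum_encoder(encoder, momentum_encoder)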

X. Chen and X. Xie contributed equally and share first authorship.


Notes

  1. https://github.com/marcotcr/checklist.git.

  2. https://github.com/thunlp/OpenAttack.


Acknowledgments

We thank the anonymous reviewers for their hard work and kind comments. This work is funded by NSFC grants 91846204 and U19B2027.

Author information


Correspondence to Ningyu Zhang or Huajun Chen.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Chen, X., et al. (2021). Disentangled Contrastive Learning for Learning Robust Textual Representations. In: Fang, L., Chen, Y., Zhai, G., Wang, J., Wang, R., Dong, W. (eds.) Artificial Intelligence. CICAI 2021. Lecture Notes in Computer Science, vol. 13070. Springer, Cham. https://doi.org/10.1007/978-3-030-93049-3_18


  • DOI: https://doi.org/10.1007/978-3-030-93049-3_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93048-6

  • Online ISBN: 978-3-030-93049-3

  • eBook Packages: Computer Science, Computer Science (R0)
