
Enhanced Few-Shot Learning with Multiple-Pattern-Exploiting Training

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13029)

Abstract

The NLPCC 2021 Few-shot Learning for Chinese Language Understanding Evaluation (FewCLUE) shared task seeks the best solutions to few-shot learning tasks with pre-trained language models. This paper presents Tencent Cloud Xiaowei’s approach to this challenge, which won 2nd place in the contest. We propose a Multiple-Pattern-Exploiting Training (MPET) method for the challenge. Unlike the original PET, MPET constructs multiple patterns to enhance the model’s generalization capability. We take MPET as an auxiliary task and jointly optimize classification and MPET. Empirical results show that MPET is effective for few-shot learning tasks.

J. Zeng and Y. Jiang contributed equally to this work.
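
The joint objective summarized in the abstract can be pictured with a short sketch: a task classification loss is combined with an MLM-style loss averaged over several hand-written cloze patterns. Everything below (model choice, the example patterns and verbalizers, and the loss weight) is an illustrative assumption, not the authors' implementation.

```python
# Illustrative sketch of joint classification + multiple-pattern training,
# loosely following the MPET idea described in the abstract. The model,
# patterns, verbalizers, and loss weight are assumptions, not the paper's code.
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")

# Hypothetical cloze patterns and verbalizers for a sentence-pair matching task.
PATTERNS = ["{a}和{b}的意思[MASK]同。", "{a}？[MASK]，{b}。"]
VERBALIZERS = [{"0": "不", "1": "相"}, {"0": "不", "1": "对"}]

def multi_pattern_loss(a: str, b: str, label: str) -> torch.Tensor:
    """MLM-style loss averaged over all patterns for one labelled pair."""
    losses = []
    for pattern, verbalizer in zip(PATTERNS, VERBALIZERS):
        enc = tokenizer(pattern.format(a=a, b=b), return_tensors="pt")
        # Supervise only the [MASK] position with the verbalized label token.
        targets = torch.full_like(enc["input_ids"], -100)
        mask_pos = enc["input_ids"] == tokenizer.mask_token_id
        targets[mask_pos] = tokenizer.convert_tokens_to_ids(verbalizer[label])
        losses.append(model(**enc, labels=targets).loss)
    return torch.stack(losses).mean()

# Joint objective: a task classification loss (separate head, omitted here)
# plus the multiple-pattern auxiliary loss, with an assumed weight of 0.5.
# total_loss = cls_loss + 0.5 * multi_pattern_loss(sent_a, sent_b, gold_label)
```

In practice each FewCLUE task would get its own pattern and verbalizer set, with the classification head sharing the encoder with the MLM head.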


Notes

  1. http://challenge.xfyun.cn/2019/gamelist.

  2. https://github.com/xiaobu-coai/BUSTM.

  3. PET computes class probabilities using only the logits that correspond to the labels of a specific task. In contrast, inspired by ADAPET [11], in this paper we compute the probability of each token in the vocabulary (see the sketch after these notes).

  4. https://github.com/huggingface/transformers.

  5. We replace the token “#idiom” in the content with “[MASK]” and treat it as a regular MLM objective.
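
A minimal sketch of the distinction drawn in note 3, under assumed shapes (a single [MASK] position and hypothetical verbalizer IDs): the PET-style score restricts the softmax to the label tokens, whereas the ADAPET-inspired variant treats the gold token as a regular MLM target over the full vocabulary, matching the MLM treatment in note 5.

```python
# Sketch of the scoring difference described in note 3 (details assumed):
# PET normalizes only over the label (verbalizer) tokens' logits, while the
# ADAPET-inspired variant scores the gold token against the full vocabulary.
import torch
import torch.nn.functional as F

def pet_class_probs(mask_logits, verbalizer_ids):
    """mask_logits: [vocab_size] logits at the [MASK] position."""
    return F.softmax(mask_logits[verbalizer_ids], dim=-1)

def full_vocab_loss(mask_logits, gold_token_id):
    """Cross-entropy over the whole vocabulary, i.e. a regular MLM target."""
    return F.cross_entropy(mask_logits.unsqueeze(0), torch.tensor([gold_token_id]))
```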

References

  1. Bowman, S.R., Angeli, G., Potts, C., Manning, C.D.: A large annotated corpus for learning natural language inference. In: Màrquez, L., Callison-Burch, C., Su, J., Pighin, D., Marton, Y. (eds.) Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, 17–21 September 2015, Lisbon, Portugal, pp. 632–642. The Association for Computational Linguistics (2015). https://doi.org/10.18653/v1/d15-1075

  2. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

  3. Gururangan, S., et al.: Don’t stop pretraining: adapt language models to domains and tasks. arXiv preprint arXiv:2004.10964 (2020)

  4. Hu, H., Richardson, K., Xu, L., Li, L., Kübler, S., Moss, L.S.: OCNLI: original Chinese natural language inference. In: Cohn, T., He, Y., Liu, Y. (eds.) Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, EMNLP 2020, Online Event, 16–20 November 2020. Findings of ACL, vol. EMNLP 2020, pp. 3512–3526. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.findings-emnlp.314

  5. Lai, G., Xie, Q., Liu, H., Yang, Y., Hovy, E.H.: RACE: large-scale reading comprehension dataset from examinations. In: Palmer, M., Hwa, R., Riedel, S. (eds.) Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, 9–11 September 2017, Copenhagen, Denmark, pp. 785–794. Association for Computational Linguistics (2017). https://doi.org/10.18653/v1/d17-1082

  6. Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: a lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)

  7. Lewis, M., et al.: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461 (2019)

  8. Liu, Y., et al.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)

  9. Rajpurkar, P., Zhang, J., Lopyrev, K., Liang, P.: SQuAD: 100,000+ questions for machine comprehension of text. In: Su, J., Carreras, X., Duh, K. (eds.) Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, 1–4 November 2016, Austin, Texas, USA, pp. 2383–2392. The Association for Computational Linguistics (2016). https://doi.org/10.18653/v1/d16-1264

  10. Schick, T., Schütze, H.: Exploiting cloze-questions for few-shot text classification and natural language inference. In: Merlo, P., Tiedemann, J., Tsarfaty, R. (eds.) Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, EACL 2021, Online, 19–23 April 2021, pp. 255–269. Association for Computational Linguistics (2021). https://www.aclweb.org/anthology/2021.eacl-main.20/

  11. Tam, D., Menon, R.R., Bansal, M., Srivastava, S., Raffel, C.: Improving and simplifying pattern exploiting training. CoRR abs/2103.11955 (2021). https://arxiv.org/abs/2103.11955

  12. Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., Bowman, S.R.: GLUE: a multi-task benchmark and analysis platform for natural language understanding. In: 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, 6–9 May 2019. OpenReview.net (2019). https://openreview.net/forum?id=rJ4km2R5t7

  13. Whang, T., Lee, D., Lee, C., Yang, K., Oh, D., Lim, H.: Domain adaptive training BERT for response selection. arXiv preprint arXiv:1908.04812 (2019)

  14. Xu, L., et al.: CLUE: a Chinese language understanding evaluation benchmark. In: Scott, D., Bel, N., Zong, C. (eds.) Proceedings of the 28th International Conference on Computational Linguistics, COLING 2020, 8–13 December 2020, Barcelona, Spain (Online), pp. 4762–4772. International Committee on Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.coling-main.419

  15. Xu, L., et al.: FewCLUE: a Chinese few-shot learning evaluation benchmark. CoRR abs/2107.07498 (2021). https://arxiv.org/abs/2107.07498

  16. Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R.R., Le, Q.V.: XLNet: generalized autoregressive pretraining for language understanding. In: Advances in Neural Information Processing Systems, vol. 32 (2019)


  17. Zheng, C., Huang, M., Sun, A.: ChID: a large-scale Chinese idiom dataset for cloze test. In: Korhonen, A., Traum, D.R., Màrquez, L. (eds.) Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, 28 July–2 August 2019, Florence, Italy, vol. 1, Long Papers, pp. 778–787. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/p19-1075

Download references

Author information

Corresponding author

Correspondence to Jiali Zeng.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zeng, J., Jiang, Y., Wu, S., Li, M. (2021). Enhanced Few-Shot Learning with Multiple-Pattern-Exploiting Training. In: Wang, L., Feng, Y., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2021. Lecture Notes in Computer Science, vol 13029. Springer, Cham. https://doi.org/10.1007/978-3-030-88483-3_31

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-88483-3_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88482-6

  • Online ISBN: 978-3-030-88483-3

  • eBook Packages: Computer Science, Computer Science (R0)
