Prompt-Based Self-training Framework for Few-Shot Named Entity Recognition

Part of the Lecture Notes in Computer Science book series (LNAI, volume 13370)

Abstract

Exploiting unlabeled data is one plausible way to improve few-shot named entity recognition (few-shot NER), where only a small number of labeled examples are given for each entity type. Existing work focuses on training deep NER models with self-training. However, self-training can produce incomplete and noisy pseudo-labels, which may fail to improve, or may even degrade, model performance. To address this challenge, we propose a two-stage prompt-based self-training framework. In the first stage, we introduce a self-training approach with prompt tuning to improve model performance. Specifically, we explore several label selection strategies in self-training to mitigate error propagation from noisy pseudo-labels. In the second stage, we fine-tune the BERT model on the high-confidence pseudo-labels together with the original labels. We conduct experiments on two benchmark datasets. The results show that our method outperforms existing few-shot NER models by significant margins, demonstrating its effectiveness in the few-shot setting.
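The abstract does not say which label selection strategies are explored. As an illustration only, the sketch below shows one common strategy, confidence thresholding, which keeps a token's pseudo-label only when the model's top predicted probability reaches a cutoff. The function name `select_confident_tokens`, the label set, the probabilities, and the threshold of 0.9 are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def select_confident_tokens(
    token_probs: Sequence[Sequence[float]],
    labels: Sequence[str],
    threshold: float = 0.9,
) -> List[Tuple[int, str, float]]:
    """Keep (token_index, pseudo_label, confidence) for tokens whose
    highest class probability is at least `threshold`."""
    selected = []
    for i, probs in enumerate(token_probs):
        # Index of the most probable class for this token.
        best = max(range(len(probs)), key=probs.__getitem__)
        confidence = probs[best]
        if confidence >= threshold:
            selected.append((i, labels[best], confidence))
    return selected


# Hypothetical label set and per-token class probabilities from a teacher model.
LABELS = ["O", "PER", "LOC"]
probs = [
    [0.05, 0.93, 0.02],  # confident PER
    [0.50, 0.30, 0.20],  # uncertain, dropped
    [0.97, 0.02, 0.01],  # confident O
    [0.10, 0.05, 0.85],  # below the 0.9 cutoff, dropped
]

picked = select_confident_tokens(probs, LABELS, threshold=0.9)
print(picked)  # [(0, 'PER', 0.93), (2, 'O', 0.97)]
```

In a second stage such as the one the abstract describes, only the retained tokens would contribute to the fine-tuning loss alongside the original labels. A lower threshold admits more pseudo-labels but also more noise.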

Keywords

  • Few-shot learning
  • Self-training
  • Prompt learning
  • Named entity recognition

Acknowledgements

The authors would like to thank the Associate Editor and anonymous reviewers for their valuable comments and suggestions. This work is funded in part by the National Natural Science Foundation of China under Grant No. 62176029, and in part by the graduate research and innovation foundation of Chongqing, China under Grant No. CYB21063. This work is also supported in part by the National Key Research and Development Program of China under Grant No. 2017YFB1402400, the Major Project of Chongqing Higher Education Teaching Reform Research (191003), and the New Engineering Research and Practice Project of the Ministry of Education (E-JSJRJ20201335).

Author information

Corresponding author

Correspondence to Jiang Zhong.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Huang, G., Zhong, J., Wang, C., Dai, Q., Li, R. (2022). Prompt-Based Self-training Framework for Few-Shot Named Entity Recognition. In: Memmi, G., Yang, B., Kong, L., Zhang, T., Qiu, M. (eds) Knowledge Science, Engineering and Management. KSEM 2022. Lecture Notes in Computer Science, vol 13370. Springer, Cham. https://doi.org/10.1007/978-3-031-10989-8_8

  • DOI: https://doi.org/10.1007/978-3-031-10989-8_8

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10988-1

  • Online ISBN: 978-3-031-10989-8

  • eBook Packages: Computer Science, Computer Science (R0)