Few-shot named entity recognition with hybrid multi-prototype learning

World Wide Web

Abstract

Information extraction provides basic technical support for knowledge graph construction and Web applications, and named entity recognition (NER) is one of its fundamental tasks. Recognizing unseen entities in large volumes of content with only a few labeled samples, a setting known as few-shot learning, is a crucial open problem. Few-shot NER aims to identify emerging named entities in context with the support of a few labeled samples. Existing methods mainly use the same strategy to construct a single prototype for each entity or non-entity class, which limits expressive power and can even yield biased representations. In this work, we propose a novel hybrid multi-prototype class representation approach. Specifically, for entity classes, we first insert labels after entities in support sentences to enrich the learned token and label embeddings with more contextual information. Then, for each entity span, the contextual token embeddings are averaged to form its entity-level prototype, while the contextual label embedding serves as its label-level prototype. The set of prototypes for all entities in a class constitutes the multi-prototype of that entity class. For the non-entity class, we directly use the set of token embeddings to represent it, so that multi-prototype here refers to the multiple token embeddings. By treating the entity and non-entity classes differently, our hybrid strategy can extract more precise class representations from the support examples. Furthermore, we establish a harder and more reasonable experimental setting for few-shot NER by offering a rigorous sampling strategy. Extensive empirical results show that our proposal improves performance over prior models on the popular benchmark Few-NERD under both the loose and our proposed rigorous sampling constraints, achieving performance comparable to the current state of the art.
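The prototype construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the helper names (`insert_labels`, `build_prototypes`, `classify`), the treatment of each label as a single token, the stand-in embedding matrix used in place of a contextual encoder, and the nearest-prototype rule with squared Euclidean distance are all assumptions made for illustration.

```python
import numpy as np

def insert_labels(tokens, spans):
    """Insert each entity's label right after its span.

    spans: non-overlapping (start, end_exclusive, label) triples.
    Returns the augmented tokens and, per entity, the positions of its
    tokens and of its inserted label in the augmented sequence.
    Assumption: a label is a single token.
    """
    out, located, cursor = [], [], 0
    for start, end, label in sorted(spans):
        out.extend(tokens[cursor:start])
        tok_pos = list(range(len(out), len(out) + (end - start)))
        out.extend(tokens[start:end])
        located.append((tok_pos, len(out), label))
        out.append(label)
        cursor = end
    out.extend(tokens[cursor:])
    return out, located

def build_prototypes(embeddings, located):
    """Build hybrid multi-prototypes from (seq_len, dim) embeddings."""
    protos, used = {}, set()
    for tok_pos, label_pos, label in located:
        entity_level = embeddings[tok_pos].mean(axis=0)  # averaged span tokens
        label_level = embeddings[label_pos]              # inserted label token
        protos.setdefault(label, []).extend([entity_level, label_level])
        used.update(tok_pos)
        used.add(label_pos)
    # Non-entity class "O": every remaining token embedding is a prototype.
    others = [embeddings[i] for i in range(len(embeddings)) if i not in used]
    if others:
        protos["O"] = others
    return {name: np.stack(vecs) for name, vecs in protos.items()}

def classify(query, protos):
    """Assumed decision rule: class of the nearest prototype."""
    def nearest(name):
        return np.min(((protos[name] - query) ** 2).sum(axis=1))
    return min(protos, key=nearest)

tokens = ["Paris", "is", "in", "France"]
spans = [(0, 1, "location"), (3, 4, "location")]
augmented, located = insert_labels(tokens, spans)
# augmented == ["Paris", "location", "is", "in", "France", "location"]

# Stand-in for contextual embeddings of the augmented sentence.
embeddings = np.arange(len(augmented) * 2, dtype=float).reshape(-1, 2)
protos = build_prototypes(embeddings, located)
```

Here the class `location` ends up with four prototypes (one entity-level and one label-level prototype per entity span), while `O` keeps the two remaining token embeddings unaveraged, which mirrors the different treatment of entity and non-entity classes.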

Data Availability

The code and the data are at https://github.com/liaozenghua/HMP.git.

Notes

  1. In the rest of the paper, we may use the word “sequence” to refer to “sentence”.

  2. In this work, we denote these relaxed few-shot settings as N-way \(\tilde {K}\)-shot. The actual average K value is denoted as K.

  3. Please refer to [35] for more details.

  4. https://github.com/kotwanikunal/entity-recognition-datasets

  5. https://huggingface.co/

  6. https://github.com/thunlp/Few-NERD

  7. https://pytorch.org/

References

  1. Bai, L., Zhang, M., Zhang, H., Zhang, H.: FTMF: few-shot temporal knowledge graph completion based on meta-optimization and fault-tolerant mechanism. World Wide Web, 1–28 (2022)

  2. Chiu, J.P.C., Nichols, E.: Named entity recognition with bidirectional LSTM-CNNs. Trans. Assoc. Comput. Linguist. 4, 357–370 (2016)

  3. Cui, L., Wu, Y., Liu, J., Yang, S., Zhang, Y.: Template-based named entity recognition using BART. In: Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, Online Event, August 1-6, 2021, pp. 1835–1845 (2021)

  4. Das, S.S.S., Katiyar, A., Passonneau, R., Zhang, R.: CONTaiNER: few-shot named entity recognition via contrastive learning. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 6338–6353. Association for Computational Linguistics, Dublin (2022). https://doi.org/10.18653/v1/2022.acl-long.439

  5. Deng, J., Guo, J., Liu, T., Gong, M., Zafeiriou, S.: Sub-center arcface: boosting face recognition by large-scale noisy Web faces. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020, pp. 741–757. Springer International Publishing, Cham (2020)

  6. Derczynski, L., Nichols, E., van Erp, M., Limsopatham, N.: Results of the WNUT2017 shared task on novel and emerging entity recognition. In: Proceedings of the 3rd Workshop on Noisy User-generated Text, pp. 140–147. Copenhagen (2017)

  7. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186. Minneapolis (2019)

  8. Ding, N., Xu, G., Chen, Y., Wang, X., Han, X., Xie, P., Zheng, H., Liu, Z.: Few-NERD: a few-shot named entity recognition dataset. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 3198–3213. Online (2021)

  9. Eberts, M., Pech, K., Ulges, A.: ManyEnt: a dataset for few-shot entity typing. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 5553–5557 (2020)

  10. Feng, X., Feng, X., Qin, B., Feng, Z., Liu, T.: Improving low resource named entity recognition using cross-lingual knowledge transfer. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, pp. 4071–4077. Stockholm (2018)

  11. Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 70, pp. 1126–1135. PMLR (2017). https://proceedings.mlr.press/v70/finn17a.html

  12. Fritzler, A., Logacheva, V., Kretov, M.: Few-shot classification in named entity recognition task. In: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, SAC ’19, pp. 993–1000. New York (2019)

  13. Gao, T., Han, X., Zhu, H., Liu, Z., Li, P., Sun, M., Zhou, J.: FewRel 2.0: towards more challenging few-shot relation classification. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 6250–6255. Hong Kong (2019)

  14. Han, X., Zhu, H., Yu, P., Wang, Z., Yao, Y., Liu, Z., Sun, M.: FewRel: a large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 4803–4809. Brussels (2018)

  15. Hou, Y., Che, W., Lai, Y., Zhou, Z., Liu, Y., Liu, H., Liu, T.: Few-shot slot tagging with collapsed dependency transfer and label-enhanced task-adaptive projection network. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 1381–1393. Online (2020)

  16. Huang, J., Li, C., Subudhi, K., Jose, D., Balakrishnan, S., Chen, W., Peng, B., Gao, J., Han, J.: Few-shot named entity recognition: an empirical baseline study. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 10408–10423 (2021)

  17. Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., Dyer, C.: Neural architectures for named entity recognition. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 260–270. San Diego (2016)

  18. Li, J., Chiu, B., Feng, S., Wang, H.: Few-shot named entity recognition via meta-learning. IEEE Trans. Knowl. Data Eng.:1–1 (2020)

  19. Li, J., Shang, S., Shao, L.: MetaNER: named entity recognition with meta-learning. In: Proceedings of The Web Conference 2020, pp. 429–440 (2020)

  20. Li, M., Li, Z., Yang, Q., Chen, Z., Zhao, P., Zhao, L.: A crowd-efficient learning approach for NER based on online encyclopedia. World Wide Web 23(1), 453–470 (2020)

  21. Li, X., Yin, H., Zhou, K., Zhou, X.: Semi-supervised clustering with deep metric learning and graph embedding. World Wide Web 23(2), 781–798 (2020)

  22. Lin, S., Gao, J., Zhang, S., He, X., Sheng, Y., Chen, J.: A continuous learning method for recognizing named entities by integrating domain contextual relevance measurement and web farming mode of web intelligence. World Wide Web 23(3), 1769–1790 (2020)

  23. Liu, F., Mao, Q., Wang, L., Ruwa, N., Gou, J., Zhan, Y.: An emotion-based responding model for natural language conversation. World Wide Web 22(2), 843–861 (2019)

  24. Liu, K., Liu, W., Ma, H., Huang, W., Dong, X.: Generalized zero-shot learning for action recognition with web-scale video data. World Wide Web 22(2), 807–824 (2019)

  25. Ma, T., Jiang, H., Wu, Q., Zhao, T., Lin, C.Y.: Decomposed meta-learning for few-shot named entity recognition. In: Findings of the Association for Computational Linguistics: ACL 2022, pp. 1584–1596. Association for Computational Linguistics, Dublin (2022). https://doi.org/10.18653/v1/2022.findings-acl.124

  26. Ma, Y., Cambria, E., Gao, S.: Label embedding for zero-shot fine-grained named entity typing. In: COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, December 11-16, 2016, pp. 171–180, Osaka (2016)

  27. Miller, E., Matsakis, N., Viola, P.: Learning from one example through shared densities on transforms. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662), vol. 1, pp. 464–471 (2000)

  28. Nguyen, H.V., Gelli, F., Poria, S.: DOZEN: cross-domain zero shot named entity recognition with knowledge graph. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1642–1646 (2021)

  29. Petroni, F., Rocktäschel, T., Riedel, S., Lewis, P.S.H., Bakhtin, A., Wu, Y., Miller, A.H.: Language models as knowledge bases? In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 2463–2473 (2019)

  30. Sun, C., Huang, L., Qiu, X.: Utilizing BERT for aspect-based sentiment analysis via constructing auxiliary sentence. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 380–385 (2019)

  31. Sun, S., Sun, Q., Zhou, K., Lv, T.: Hierarchical attention prototypical networks for few-shot text classification. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, November 3-7, 2019, pp. 476–485. Hong Kong (2019)

  32. Tjong Kim Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: language-independent named entity recognition. In: Proceedings of the 7th Conference on Natural Language Learning at HLT-NAACL 2003, pp. 142–147 (2003)

  33. Tong, M., Wang, S., Xu, B., Cao, Y., Liu, M., Hou, L., Li, J.: Learning from miscellaneous other-class words for few-shot named entity recognition. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 6236–6247. Online (2021)

  34. Tong, M., Wang, S., Xu, B., Cao, Y., Liu, M., Hou, L., Li, J.: Learning from miscellaneous other-class words for few-shot named entity recognition. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, pp. 6236–6247 (2021)

  35. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, pp. 6000–6010. Curran Associates Inc., Red Hook (2017)

  36. Wang, Y., Chu, H., Zhang, C., Gao, J.: Learning from language description: low-shot named entity recognition via decomposed framework. In: Findings of the Association for Computational Linguistics: EMNLP 2021, pp. 1618–1630. Association for Computational Linguistics, Punta Cana (2021). https://doi.org/10.18653/v1/2021.findings-emnlp.139

  37. Wen, W., Liu, Y., Ouyang, C., Lin, Q., Chung, T.: Enhanced prototypical network for few-shot relation extraction. Inf. Process. Manag. 58(4), 102596 (2021)

  38. Xu, L., Zhang, X., Zhao, X., Chen, H., Chen, F., Choi, J.D.: Boosting cross-lingual transfer via self-learning with uncertainty estimation. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 6716–6723 (2021)

  39. Yang, Y., Katiyar, A.: Simple and effective few-shot named entity recognition with structured nearest neighbor learning. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 6365–6375. Online (2020)

  40. Yoon, S.W., Seo, J., Moon, J.: TapNet: neural network augmented with task-adaptive projection for few-shot learning. In: Proceedings of the 36th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 97, pp. 7115–7123 (2019)

  41. Zhong, P., Wang, D., Miao, C.: Knowledge-enriched transformer for emotion detection in textual conversations. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 165–176. Association for Computational Linguistics, Hong Kong (2019). https://doi.org/10.18653/v1/D19-1016

  42. Zhou, J.T., Zhang, H., Jin, D., Zhu, H., Fang, M., Goh, R.S.M., Kwok, K.: Dual adversarial neural transfer for low-resource named entity recognition. In: Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers, pp. 3461–3471 (2019)

Acknowledgements

We thank the editorial committee for its support and the anonymous reviewers for their insightful comments and suggestions, which improved the content and presentation of this manuscript.

Funding

This work was partially supported by the National Key R&D Program of China (No. 2020AAA0108800) and by NSFC under grant Nos. 62272469 and U19B2024.

Author information

Contributions

Zenghua Liao and Junbo Fei wrote the main manuscript text and designed the methodology framework. Weixin Zeng and Xiang Zhao prepared the experiments. All authors reviewed the manuscript.

Corresponding author

Correspondence to Xiang Zhao.

Ethics declarations

Ethics approval and consent to participate

This declaration is not applicable.

Competing interests

I declare that all authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and discussion reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Knowledge-Graph-Enabled Methods and Applications for the Future Web. Guest Editors: Xin Wang, Jeff Pan, Qingpeng Zhang, Yuan-Fang Li

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liao, Z., Fei, J., Zeng, W. et al. Few-shot named entity recognition with hybrid multi-prototype learning. World Wide Web 26, 2521–2544 (2023). https://doi.org/10.1007/s11280-023-01143-5

  • DOI: https://doi.org/10.1007/s11280-023-01143-5
