Few-shot named entity recognition framework for forestry science metadata extraction

Fan, Yuquan; Xiao, Hong; Wang, Min; Wang, Junchi; Jiang, Wenchao; Zhu, Chang

doi:10.1007/s12652-023-04740-4

Original Research
Published: 01 February 2024

Volume 15, pages 2105–2118, (2024)
Journal of Ambient Intelligence and Humanized Computing

Yuquan Fan¹,
Hong Xiao¹,
Min Wang²,
Junchi Wang¹,
Wenchao Jiang¹ (ORCID: orcid.org/0000-0002-6300-1962) &
Chang Zhu¹

Abstract

Effective use of the accumulated body of forestry science papers is important for understanding the current state of forests and for formulating forest conservation strategies. However, the metadata associated with these papers is sparse, which makes them difficult to exploit fully. Metadata from forestry science papers underpins the efficient management and use of these documents and plays an essential role in advancing forestry research. Because constructing a training corpus and extracting distant semantic relationships are inherently challenging, the use of named entity recognition (NER) for identifying metadata entities in forestry science papers remains unexplored. To overcome these limitations, this paper creates a specialized training corpus and introduces a novel few-shot NER framework tailored to metadata extraction from forestry science papers. Within this framework, a data augmentation layer that employs word replacement (WR) and enhanced mixup (EM) addresses the poor performance caused by scarce training data. The semantic comprehension layer incorporates a multi-granularity dilated convolution neural network (MGDCNN) to capture distant semantic associations. In addition, a meta-learning-based reweighting layer mitigates the adverse effects of low-quality augmented examples on the model. Experimental results show that the proposed framework achieves precision, recall, and F1 of 91.08%, 88.96%, and 90.00%, respectively. Compared with traditional models, precision, recall, and F1 improve by up to 10.69%, 7.48%, and 9.07%, respectively.
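To make the augmentation idea concrete, the following is a minimal sketch of plain mixup interpolation between two training examples. It is not the paper's enhanced mixup (EM), whose details the abstract does not give. The function name `mixup`, the toy embeddings, the one-hot labels, and the choice `alpha=0.5` are all illustrative assumptions.

```python
import random


def mixup(x_i, y_i, x_j, y_j, alpha=0.5, rng=random):
    """Plain mixup: convex combination of two examples and their labels.

    Assumption: inputs are equal-length lists of floats (e.g. embeddings)
    and labels are one-hot or soft label vectors. This is the generic
    formulation, not the paper's enhanced variant.
    """
    # Mixing coefficient drawn from a symmetric Beta(alpha, alpha) distribution.
    lam = rng.betavariate(alpha, alpha)
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix, lam


# Toy example: two hypothetical token embeddings with one-hot entity labels.
rng = random.Random(0)  # seeded generator for reproducibility
x_a, y_a = [1.0, 0.0, 2.0], [1.0, 0.0]
x_b, y_b = [0.0, 4.0, 2.0], [0.0, 1.0]
x_mix, y_mix, lam = mixup(x_a, y_a, x_b, y_b, alpha=0.5, rng=rng)
```

Because the mixed label is a soft distribution rather than a hard class, some interpolated examples can be uninformative or misleading, which is the kind of low-quality augmented example that a reweighting layer is meant to down-weight.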

Acknowledgements

This study was funded by Guangdong Basic and Applied Basic Research Fund Project (Grant/Award Number 2020B1515120010), Key Technology Project of Foshan City (Grant/Award Number 1920001001367), Guangdong Science and Technology Plan Project (Grant/Award Number 2019B010139001), Guangdong Natural Science Fund Project (Grant/Award Number 2021A1515011243), and Guangzhou Science and Technology Plan Project (Grant/Award Number 201902020016).

Author information

Authors and Affiliations

School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, China
Yuquan Fan, Hong Xiao, Junchi Wang, Wenchao Jiang & Chang Zhu
Chinamobile Park Construction and Development Company, Beijing, 102206, China
Min Wang

Corresponding author

Correspondence to Wenchao Jiang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fan, Y., Xiao, H., Wang, M. et al. Few-shot named entity recognition framework for forestry science metadata extraction. J Ambient Intell Human Comput 15, 2105–2118 (2024). https://doi.org/10.1007/s12652-023-04740-4

Received: 06 June 2023
Accepted: 08 December 2023
Published: 01 February 2024
Issue Date: April 2024
DOI: https://doi.org/10.1007/s12652-023-04740-4
