Abstract
This paper presents an enhanced approach for adapting a Language Model (LM) to a specific domain, with a focus on Named Entity Recognition (NER) and Named Entity Linking (NEL) tasks. Traditional NER/NEL methods require large amounts of labeled data, which is time- and resource-intensive to produce. Unsupervised and semi-supervised approaches overcome this limitation but yield lower quality. Our approach, called KeyWord Masking (KWM), fine-tunes an LM for the Masked Language Modeling (MLM) task in a special way. Our experiments demonstrate that KWM outperforms traditional methods at restoring domain-specific entities. This work is a preliminary step towards developing a more sophisticated NER/NEL system for domain-specific data.
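The abstract does not spell out how KWM differs from standard MLM training, so the following is a minimal, hypothetical sketch of the core idea: instead of masking 15% of tokens at random as in BERT-style pre-training, only tokens belonging to domain-specific keywords (entities) are masked, and the MLM loss is computed only on those positions. The tokenizer, vocabulary, and keyword list below are toy stand-ins, not the authors' actual implementation.

```python
# Illustrative sketch of KeyWord Masking (KWM) for MLM fine-tuning.
# Only domain-keyword tokens are replaced by [MASK]; all other
# positions are excluded from the loss via the ignore index -100.

MASK, IGNORE = "[MASK]", -100  # -100 is the usual ignore index for MLM loss

def keyword_mask(tokens, keywords, token_to_id):
    """Mask every token that is a domain keyword and build MLM labels."""
    input_tokens, labels = [], []
    for tok in tokens:
        if tok.lower() in keywords:
            input_tokens.append(MASK)                 # hide the entity
            labels.append(token_to_id[tok.lower()])   # model must restore it
        else:
            input_tokens.append(tok)
            labels.append(IGNORE)                     # no loss on other tokens
    return input_tokens, labels

# Toy example: "Xylella fastidiosa" stands in for a domain entity.
vocab = {w: i for i, w in enumerate(
    ["xylella", "fastidiosa", "infects", "olive", "trees"])}
keywords = {"xylella", "fastidiosa"}
tokens = "Xylella fastidiosa infects olive trees".split()
masked, labels = keyword_mask(tokens, keywords, vocab)
print(masked)  # ['[MASK]', '[MASK]', 'infects', 'olive', 'trees']
print(labels)  # [0, 1, -100, -100, -100]
```

In practice the masked inputs and labels would be fed to a subword tokenizer and a pre-trained LM (e.g. via the Hugging Face `transformers` library); the sketch above only shows how the masking decision is driven by a keyword list rather than by random sampling.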
References
BEYOND: Building epidemiological surveillance & prophylaxis with observations near & distant. https://www6.inrae.fr/beyond/. Accessed 06 Feb 2023
GeoNames. https://www.geonames.org/. Accessed 06 Feb 2023
PESV. https://plateforme-esv.fr. Accessed 06 Feb 2023
EPPO (2023). EPPO Global Database (available online). https://gd.eppo.int/. Accessed 06 Feb 2023
Ayoola, T., Fisher, J., Pierleoni, A.: Improving entity disambiguation by reasoning over a knowledge base. arXiv preprint arXiv:2207.04106 (2022)
Baevski, A., Edunov, S., Liu, Y., Zettlemoyer, L., Auli, M.: Cloze-driven pretraining of self-attention networks. arXiv preprint arXiv:1903.07785 (2019)
Bossy, R., Deléger, L., Chaix, E., Ba, M., Nédellec, C.: Bacteria Biotope 2019 (2022). https://doi.org/10.57745/PCQFC2
Chen, X., et al.: One model for all domains: collaborative domain-prefix tuning for cross-domain NER. arXiv preprint arXiv:2301.10410 (2023)
Derczynski, L., Llorens, H., Saquete, E.: Massively increasing TIMEX3 resources: a transduction approach. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey, pp. 3754–3761. European Language Resources Association (ELRA) (2012). http://www.lrec-conf.org/proceedings/lrec2012/pdf/451_Paper.pdf
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, Minnesota (Volume 1: Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/N19-1423. https://aclanthology.org/N19-1423
Ferré, A., Deléger, L., Bossy, R., Zweigenbaum, P., Nédellec, C.: C-norm: a neural approach to few-shot entity normalization. BMC Bioinform. 21(23), 1–19 (2020)
Fritzler, A., Logacheva, V., Kretov, M.: Few-shot classification in named entity recognition task. In: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, pp. 993–1000 (2019)
Gligic, L., Kormilitzin, A., Goldberg, P., Nevado-Holgado, A.: Named entity recognition in electronic health records using transfer learning bootstrapped neural networks. Neural Netw. 121, 132–139 (2020)
Gritta, M., Pilehvar, M.T., Collier, N.: Which Melbourne? Augmenting geocoding with maps. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1285–1296 (2018)
Imambi, S., Prakash, K.B., Kanagachidambaresan, G.: PyTorch. In: Programming with TensorFlow: Solution for Edge Computing Applications, pp. 87–104 (2021)
Iovine, A., Fang, A., Fetahu, B., Rokhlenko, O., Malmasi, S.: CycleNER: an unsupervised training approach for named entity recognition. In: Proceedings of the ACM Web Conference 2022, pp. 2916–2924 (2022)
Jia, C., Liang, X., Zhang, Y.: Cross-domain NER using cross-domain language modeling. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 2464–2474 (2019)
Jiang, S., Cormier, S., Angarita, R., Rousseaux, F.: Improving text mining in plant health domain with GAN and/or pre-trained language model. Front. Artif. Intell. 6 (2023)
Lee, J., et al.: BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36(4), 1234–1240 (2020)
Li, J., Chiu, B., Feng, S., Wang, H.: Few-shot named entity recognition via meta-learning. IEEE Trans. Knowl. Data Eng. 34(9), 4245–4256 (2022). https://doi.org/10.1109/TKDE.2020.3038670
Li, X., et al.: Effective few-shot named entity linking by meta-learning. In: 2022 IEEE 38th International Conference on Data Engineering (ICDE), pp. 178–191. IEEE (2022)
Liu, Z., Jiang, F., Hu, Y., Shi, C., Fung, P.: NER-BERT: a pre-trained model for low-resource entity tagging. arXiv preprint arXiv:2112.00405 (2021)
Liu, Z., et al.: CrossNER: evaluating cross-domain named entity recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 13452–13460 (2021)
Ming, H., Yang, J., Jiang, L., Pan, Y., An, N.: Few-shot nested named entity recognition. arXiv e-prints, pp. arXiv-2212 (2022)
Neumann, M., King, D., Beltagy, I., Ammar, W.: ScispaCy: fast and robust models for biomedical natural language processing. In: BioNLP 2019, p. 319 (2019)
Pergola, G., Kochkina, E., Gui, L., Liakata, M., He, Y.: Boosting low-resource biomedical QA via entity-aware masking strategies. In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pp. 1977–1985 (2021)
Peters, M.E., Ammar, W., Bhagavatula, C., Power, R.: Semi-supervised sequence tagging with bidirectional language models. arXiv preprint arXiv:1705.00108 (2017)
Popovski, G., Kochev, S., Korousic-Seljak, B., Eftimov, T.: Foodie: a rule-based named-entity recognition method for food information extraction. ICPRAM 12, 915 (2019)
Raiman, J., Raiman, O.: Deeptype: multilingual entity linking by neural type system evolution. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018)
van Rossum, G.: Python programming language, version 3.8.15 (2022). https://www.python.org/downloads/release/python-3815/. Accessed 06 Feb 2023
Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Schoch, C., et al.: NCBI taxonomy: a comprehensive update on curation, resources and tools. Database 2020 (2020). https://doi.org/10.1093/database/baaa062. https://www.ncbi.nlm.nih.gov/taxonomy. Accessed 06 Feb 2023
Ushio, A., Camacho-Collados, J.: T-NER: an all-round python library for transformer-based named entity recognition. arXiv preprint arXiv:2209.12616 (2022)
Wang, C., Sun, X., Yu, H., Zhang, W.: Entity disambiguation leveraging multi-perspective attention. IEEE Access 7, 113963–113974 (2019)
Wang, S., et al.: kNN-NER: named entity recognition with nearest neighbor search. arXiv preprint arXiv:2203.17103 (2022)
Wettig, A., Gao, T., Zhong, Z., Chen, D.: Should you mask 15% in masked language modeling? arXiv preprint arXiv:2202.08005 (2022)
Wolf, T., et al.: Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45. Association for Computational Linguistics, Online (2020). https://www.aclweb.org/anthology/2020.emnlp-demos.6
Xu, J., Gan, L., Cheng, M., Wu, Q.: Unsupervised medical entity recognition and linking in Chinese online medical text. J. Healthcare Eng. 2018 (2018)
Yamada, I., Asai, A., Shindo, H., Takeda, H., Matsumoto, Y.: LUKE: deep contextualized entity representations with entity-aware self-attention. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 6442–6454. Association for Computational Linguistics, Online (2020). https://doi.org/10.18653/v1/2020.emnlp-main.523. https://aclanthology.org/2020.emnlp-main.523
Yamada, I., Washio, K., Shindo, H., Matsumoto, Y.: Global entity disambiguation with BERT. In: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 3264–3271 (2022)
Zhang, S., Elhadad, N.: Unsupervised biomedical named entity recognition: experiments with clinical and biological texts. J. Biomed. Inform. 46(6), 1088–1098 (2013)
Acknowledgements
The authors would like to express their sincere gratitude to the BEYOND project [1] (ANR-20-PCPA-0002) for providing the funding that made this research possible.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Borovikova, M., Ferré, A., Bossy, R., Roche, M., Nédellec, C. (2023). Could KeyWord Masking Strategy Improve Language Model?. In: Métais, E., Meziane, F., Sugumaran, V., Manning, W., Reiff-Marganiec, S. (eds) Natural Language Processing and Information Systems. NLDB 2023. Lecture Notes in Computer Science, vol 13913. Springer, Cham. https://doi.org/10.1007/978-3-031-35320-8_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35319-2
Online ISBN: 978-3-031-35320-8