Could KeyWord Masking Strategy Improve Language Model?

  • Conference paper
  • In: Natural Language Processing and Information Systems (NLDB 2023)

Abstract

This paper presents an enhanced approach for adapting a Language Model (LM) to a specific domain, with a focus on Named Entity Recognition (NER) and Named Entity Linking (NEL) tasks. Traditional NER/NEL methods require large amounts of labeled data, which are time- and resource-intensive to produce. Unsupervised and semi-supervised approaches overcome this limitation but suffer from lower quality. Our approach, called KeyWord Masking (KWM), fine-tunes an LM for the Masked Language Modeling (MLM) task using a dedicated masking strategy that targets domain-specific keywords. Our experiments demonstrate that KWM outperforms traditional methods at restoring domain-specific entities. This work is a preliminary step towards developing a more sophisticated NER/NEL system for domain-specific data.
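
The abstract's core idea (masking known domain keywords instead of random tokens during MLM fine-tuning) can be illustrated with a short sketch. The snippet below is a hypothetical illustration, not the authors' implementation: the helper keyword_mask, the naive substring matching, and the toy plant-health keywords are assumptions; only the PyTorch [15] and Hugging Face Transformers [37] APIs are real.

```python
# Hedged sketch of KeyWord Masking (KWM): mask only the tokens that belong
# to domain-specific keywords, instead of a random 15% as in standard MLM.
# Hypothetical helper -- not the authors' code.
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def keyword_mask(text, keywords):
    """Replace every keyword occurrence with [MASK] and build MLM labels
    so the loss is computed only at the masked keyword positions."""
    enc = tokenizer(text, return_offsets_mapping=True, return_tensors="pt")
    input_ids = enc["input_ids"].clone()
    labels = torch.full_like(input_ids, -100)  # -100 is ignored by the loss

    # Naive case-insensitive substring search for keyword character spans
    # (a real system would rely on a domain lexicon or entity annotations).
    lowered = text.lower()
    spans = []
    for kw in keywords:
        start = lowered.find(kw.lower())
        while start != -1:
            spans.append((start, start + len(kw)))
            start = lowered.find(kw.lower(), start + 1)

    # Mask every token whose character span overlaps a keyword span.
    for i, (s, e) in enumerate(enc["offset_mapping"][0].tolist()):
        if s == e:  # special tokens such as [CLS] and [SEP]
            continue
        if any(s < ke and e > ks for ks, ke in spans):
            labels[0, i] = input_ids[0, i]        # target: original token
            input_ids[0, i] = tokenizer.mask_token_id
    return input_ids, labels

ids, labels = keyword_mask(
    "Xylella fastidiosa was detected on olive trees in Apulia.",
    keywords=["Xylella fastidiosa", "olive"],
)
print(tokenizer.decode(ids[0]))  # keyword positions now read [MASK]
```

Pairs of (ids, labels) produced this way could then be fed to a masked-LM head such as BertForMaskedLM [10, 37], so that fine-tuning concentrates on restoring domain-specific entities rather than arbitrary tokens.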


Notes

  1. https://entrepot.recherche.data.gouv.fr/dataset.xhtml?persistentId=doi%3A10.57745%2FHVPITE.

References

  1. BEYOND: Building epidemiological surveillance & prophylaxis with observations near & distant. https://www6.inrae.fr/beyond/. Accessed 06 Feb 2023

  2. GeoNames. https://www.geonames.org/. Accessed 06 Feb 2023

  3. PESV. https://plateforme-esv.fr/. Accessed 06 Feb 2023

  4. EPPO (2023). EPPO Global Database (available online). https://gd.eppo.int/. Accessed 06 Feb 2023

  5. Ayoola, T., Fisher, J., Pierleoni, A.: Improving entity disambiguation by reasoning over a knowledge base. arXiv preprint arXiv:2207.04106 (2022)

  6. Baevski, A., Edunov, S., Liu, Y., Zettlemoyer, L., Auli, M.: Cloze-driven pretraining of self-attention networks. arXiv preprint arXiv:1903.07785 (2019)

  7. Bossy, R., Deléger, L., Chaix, E., Ba, M., Nédellec, C.: Bacteria Biotope 2019 (2022). https://doi.org/10.57745/PCQFC2

  8. Chen, X., et al.: One model for all domains: collaborative domain-prefix tuning for cross-domain NER. arXiv preprint arXiv:2301.10410 (2023)

  9. Derczynski, L., Llorens, H., Saquete, E.: Massively increasing TIMEX3 resources: a transduction approach. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey, pp. 3754–3761. European Language Resources Association (ELRA) (2012). http://www.lrec-conf.org/proceedings/lrec2012/pdf/451_Paper.pdf

  10. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, Minnesota (Volume 1: Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/N19-1423. https://aclanthology.org/N19-1423

  11. Ferré, A., Deléger, L., Bossy, R., Zweigenbaum, P., Nédellec, C.: C-norm: a neural approach to few-shot entity normalization. BMC Bioinform. 21(23), 1–19 (2020)

  12. Fritzler, A., Logacheva, V., Kretov, M.: Few-shot classification in named entity recognition task. In: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, pp. 993–1000 (2019)

  13. Gligic, L., Kormilitzin, A., Goldberg, P., Nevado-Holgado, A.: Named entity recognition in electronic health records using transfer learning bootstrapped neural networks. Neural Netw. 121, 132–139 (2020)

  14. Gritta, M., Pilehvar, M.T., Collier, N.: Which Melbourne? Augmenting geocoding with maps. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1285–1296 (2018)

  15. Imambi, S., Prakash, K.B., Kanagachidambaresan, G.: PyTorch. In: Programming with TensorFlow: Solution for Edge Computing Applications, pp. 87–104 (2021)

  16. Iovine, A., Fang, A., Fetahu, B., Rokhlenko, O., Malmasi, S.: CycleNER: an unsupervised training approach for named entity recognition. In: Proceedings of the ACM Web Conference 2022, pp. 2916–2924 (2022)

  17. Jia, C., Liang, X., Zhang, Y.: Cross-domain NER using cross-domain language modeling. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 2464–2474 (2019)

  18. Jiang, S., Cormier, S., Angarita, R., Rousseaux, F.: Improving text mining in plant health domain with GAN and/or pre-trained language model. Front. Artif. Intell. 6 (2023)

  19. Lee, J., et al.: BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36(4), 1234–1240 (2020)

  20. Li, J., Chiu, B., Feng, S., Wang, H.: Few-shot named entity recognition via meta-learning. IEEE Trans. Knowl. Data Eng. 34(9), 4245–4256 (2022). https://doi.org/10.1109/TKDE.2020.3038670

  21. Li, X., et al.: Effective few-shot named entity linking by meta-learning. In: 2022 IEEE 38th International Conference on Data Engineering (ICDE), pp. 178–191. IEEE (2022)

  22. Liu, Z., Jiang, F., Hu, Y., Shi, C., Fung, P.: NER-BERT: a pre-trained model for low-resource entity tagging. arXiv preprint arXiv:2112.00405 (2021)

  23. Liu, Z., et al.: CrossNER: evaluating cross-domain named entity recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 13452–13460 (2021)

  24. Ming, H., Yang, J., Jiang, L., Pan, Y., An, N.: Few-shot nested named entity recognition. arXiv e-prints, arXiv-2212 (2022)

  25. Neumann, M., King, D., Beltagy, I., Ammar, W.: ScispaCy: fast and robust models for biomedical natural language processing. In: BioNLP 2019, p. 319 (2019)

  26. Pergola, G., Kochkina, E., Gui, L., Liakata, M., He, Y.: Boosting low-resource biomedical QA via entity-aware masking strategies. In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pp. 1977–1985 (2021)

  27. Peters, M.E., Ammar, W., Bhagavatula, C., Power, R.: Semi-supervised sequence tagging with bidirectional language models. arXiv preprint arXiv:1705.00108 (2017)

  28. Popovski, G., Kochev, S., Korousic-Seljak, B., Eftimov, T.: FoodIE: a rule-based named-entity recognition method for food information extraction. ICPRAM 12, 915 (2019)

  29. Raiman, J., Raiman, O.: Deeptype: multilingual entity linking by neural type system evolution. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018)

  30. van Rossum, G.: Python programming language, version 3.8.15 (2022). https://www.python.org/downloads/release/python-3815/. Accessed 06 Feb 2023

  31. Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: language-independent named entity recognition. arXiv preprint cs/0306050 (2003)

  32. Schoch, C., et al.: NCBI taxonomy: a comprehensive update on curation, resources and tools. Database 2020 (2020). https://doi.org/10.1093/database/baaa062. https://www.ncbi.nlm.nih.gov/taxonomy. Accessed 06 Feb 2023

  33. Ushio, A., Camacho-Collados, J.: T-NER: an all-round python library for transformer-based named entity recognition. arXiv preprint arXiv:2209.12616 (2022)

  34. Wang, C., Sun, X., Yu, H., Zhang, W.: Entity disambiguation leveraging multi-perspective attention. IEEE Access 7, 113963–113974 (2019)

  35. Wang, S., et al.: kNN-NER: named entity recognition with nearest neighbor search. arXiv preprint arXiv:2203.17103 (2022)

  36. Wettig, A., Gao, T., Zhong, Z., Chen, D.: Should you mask 15% in masked language modeling? arXiv preprint arXiv:2202.08005 (2022)

  37. Wolf, T., et al.: Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45. Association for Computational Linguistics, Online (2020). https://www.aclweb.org/anthology/2020.emnlp-demos.6

  38. Xu, J., Gan, L., Cheng, M., Wu, Q.: Unsupervised medical entity recognition and linking in Chinese online medical text. J. Healthcare Eng. 2018 (2018)

  39. Yamada, I., Asai, A., Shindo, H., Takeda, H., Matsumoto, Y.: LUKE: deep contextualized entity representations with entity-aware self-attention. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 6442–6454. Association for Computational Linguistics, Online (2020). https://doi.org/10.18653/v1/2020.emnlp-main.523. https://aclanthology.org/2020.emnlp-main.523

  40. Yamada, I., Washio, K., Shindo, H., Matsumoto, Y.: Global entity disambiguation with BERT. In: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 3264–3271 (2022)

  41. Zhang, S., Elhadad, N.: Unsupervised biomedical named entity recognition: experiments with clinical and biological texts. J. Biomed. Inform. 46(6), 1088–1098 (2013)


Acknowledgements

The authors would like to express their sincere gratitude to the BEYOND project (ANR-20-PCPA-0002) [1] for providing the funding that made this research possible.

Author information

Correspondence to Mariya Borovikova.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Borovikova, M., Ferré, A., Bossy, R., Roche, M., Nédellec, C. (2023). Could KeyWord Masking Strategy Improve Language Model? In: Métais, E., Meziane, F., Sugumaran, V., Manning, W., Reiff-Marganiec, S. (eds) Natural Language Processing and Information Systems. NLDB 2023. Lecture Notes in Computer Science, vol 13913. Springer, Cham. https://doi.org/10.1007/978-3-031-35320-8_19

  • DOI: https://doi.org/10.1007/978-3-031-35320-8_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-35319-2

  • Online ISBN: 978-3-031-35320-8

  • eBook Packages: Computer Science, Computer Science (R0)
