Abstract
Named entity recognition (NER) is a central task for text information retrieval in natural language processing (NLP). Most recent state-of-the-art NER methods require humans to annotate data for model training. However, relying on human effort to identify, circumscribe, and label entities manually can be very expensive in terms of time, money, and labor. This paper investigates the use of prompt-based language models (OpenAI's GPT-3) and weak supervision in the legal domain. We apply both strategies as alternatives to traditional human annotation, relying on computation instead of human effort for labeling, and then compare the performance of models trained on machine-generated versus human-generated data. We also introduce combinations of the three methods (prompt-based labeling, weak supervision, and human annotation), aiming to maintain high model performance at low annotation cost. We show that, although human labeling still yields the best overall results, the alternative strategies and their combinations are valid options, producing comparable model scores at lower cost. Relative to models trained on human-labeled data, final scores are preserved at an average of 74.0% for GPT-3, 95.6% for weak supervision, 90.7% for the GPT + weak supervision combination, and 83.9% for the GPT + 30% human-labeling combination.
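To make the weak-supervision strategy concrete, the following is a minimal, self-contained sketch of the idea rather than the authors' implementation (which uses the skweak library): several heuristic labeling functions each propose token-level tags for legal text, and a simple majority vote aggregates them into one weakly labeled sequence. All rule names, entity labels, and trigger words below are illustrative assumptions.

```python
import re
from collections import Counter

def lf_law_reference(tokens):
    """Tag a word like 'Lei' or 'Decreto' followed by a number as LEGISLATION."""
    tags = ["O"] * len(tokens)
    for i, tok in enumerate(tokens[:-1]):
        if tok.lower() in {"lei", "decreto"} and re.match(r"\d", tokens[i + 1]):
            tags[i] = tags[i + 1] = "LEGISLATION"
    return tags

def lf_person_title(tokens):
    """Tag a capitalized token following a title such as 'Dr.' as PERSON."""
    tags = ["O"] * len(tokens)
    for i, tok in enumerate(tokens[:-1]):
        if tok in {"Dr.", "Des.", "Min."} and tokens[i + 1][:1].isupper():
            tags[i + 1] = "PERSON"
    return tags

def lf_all_caps_org(tokens):
    """Tag fully upper-case tokens of length >= 3 (e.g. court acronyms) as ORGANIZATION."""
    return ["ORGANIZATION" if t.isupper() and len(t) >= 3 else "O" for t in tokens]

def majority_vote(tokens, labeling_functions):
    """Aggregate labeling-function outputs token by token.

    For each position, the most common non-'O' vote wins; with no
    non-'O' votes the token stays 'O'.
    """
    votes = [lf(tokens) for lf in labeling_functions]
    aggregated = []
    for position in range(len(tokens)):
        counts = Counter(v[position] for v in votes if v[position] != "O")
        aggregated.append(counts.most_common(1)[0][0] if counts else "O")
    return aggregated

tokens = "O STF aplicou a Lei 8.666 conforme o Min. Silva".split()
print(majority_vote(tokens, [lf_law_reference, lf_person_title, lf_all_caps_org]))
```

In practice, frameworks such as skweak or Snorkel replace the majority vote with a generative model (e.g. an HMM) that estimates each labeling function's accuracy, but the vote above captures the core aggregation step.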
Notes
The Critical Difference (CD) is a metric introduced in Demšar (2006) that determines whether two or more learning algorithms, in a specific domain, are statistically different. The CD value is computed from the algorithms' results, ranked across datasets, and represents a threshold: if the difference between two algorithms' average ranks exceeds the CD, the algorithms can be declared statistically different.
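For the Nemenyi post-hoc test described in Demšar (2006), the threshold is CD = q_α · sqrt(k(k+1)/(6N)), where k is the number of algorithms and N the number of datasets. The sketch below computes it, using the standard two-tailed critical values q_α at α = 0.05 from that paper:

```python
import math

# Critical values q_alpha for the two-tailed Nemenyi test at alpha = 0.05,
# indexed by the number of compared algorithms k (Demšar 2006).
Q_ALPHA_005 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728,
               6: 2.850, 7: 2.949, 8: 3.031, 9: 3.102, 10: 3.164}

def critical_difference(k, n, q_table=Q_ALPHA_005):
    """CD = q_alpha * sqrt(k * (k + 1) / (6 * N)) for k algorithms on N datasets."""
    return q_table[k] * math.sqrt(k * (k + 1) / (6 * n))

# Two algorithms whose average ranks differ by more than CD are
# statistically different; e.g. 4 algorithms compared over 10 datasets:
print(round(critical_difference(4, 10), 3))
```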
References
Bach SH, Rodriguez D, Liu Y et al (2019) Snorkel DryBell: a case study in deploying weak supervision at industrial scale. In: Proceedings of the 2019 international conference on management of data, SIGMOD ’19. Association for Computing Machinery, New York, NY, USA, pp 362–375. https://doi.org/10.1145/3299869.3314036
Brown TB, Mann B, Ryder N et al (2020) Language models are few-shot learners. arXiv:2005.14165
Chowdhary K (2020) Natural language processing. In: Fundamentals of artificial intelligence. Springer, New Delhi, pp 603–649
Dai H, Song Y, Wang H (2021) Ultra-fine entity typing with weak supervision from a masked language model. arXiv:2106.04098
Dale R (2021) GPT-3: what’s it good for? Nat Lang Eng 27(1):113–118. https://doi.org/10.1017/S1351324920000601
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Dozier C, Kondadadi R, Light M et al (2010) Named entity recognition and resolution in legal text. In: Semantic processing of legal texts. Springer, pp 27–43
Eddy SR (2004) What is a hidden Markov model? Nat Biotechnol 22(10):1315–1316
Floridi L, Chiriatti M (2020) GPT-3: its nature, scope, limits, and consequences. Mind Mach 30(4):681–694
Fredriksson T, Mattos DI, Bosch J et al (2020) Data labeling: an empirical investigation into industrial challenges and mitigation strategies. In: Product-focused software process improvement: 21st international conference, PROFES 2020, Proceedings 21, Turin, Italy, November 25–27, 2020. Springer, pp 202–216
Giri R, Porwal Y, Shukla V et al (2017) Approaches for information retrieval in legal documents. In: 2017 tenth international conference on contemporary computing (IC3). IEEE, pp 1–6
Graves A, Schmidhuber J (2005) Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw 18(5):602–610. https://doi.org/10.1016/j.neunet.2005.06.042
Karamanolakis G, Mukherjee S, Zheng G et al (2021) Self-training with weak supervision. arXiv:2104.05514
Lison P, Hubin A, Barnes J et al (2020) Named entity recognition without labelled data: a weak supervision approach. arXiv:2004.14723
Lison P, Barnes J, Hubin A (2021) skweak: weak supervision made easy for NLP. arXiv preprint arXiv:2104.09683
Liu Y, Ott M, Goyal N et al (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692
Liu P, Yuan W, Fu J et al (2023) Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput Surv 55(9):1–35. https://doi.org/10.1145/3560815
Luz de Araujo PH, de Campos TE, de Oliveira RR et al (2018) LeNER-Br: a dataset for named entity recognition in Brazilian legal text. In: International conference on computational processing of the Portuguese language. Springer, pp 313–323
Maiya AS (2020) ktrain: a low-code library for augmented machine learning. arXiv preprint arXiv:2004.10703 [cs.LG]
Marrero M, Urbano J, Sánchez-Cuadrado S et al (2013) Named entity recognition: fallacies, challenges and opportunities. Comput Stand Interfaces 35(5):482–489
Meyer S, Elsweiler D, Ludwig B et al (2022) Do we still need human assessors? Prompt-based GPT-3 user simulation in conversational AI. In: Proceedings of the 4th conference on conversational user interfaces, CUI ’22. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3543829.3544529
Nasar Z, Jaffry SW, Malik MK (2021) Named entity recognition and relation extraction: state-of-the-art. ACM Comput Surv 54(1):1–39
Ratner A, Bach SH, Ehrenberg H et al (2020) Snorkel: rapid training data creation with weak supervision. VLDB J 29(2):709–730
Ratner AJ, De Sa CM, Wu S et al (2016) Data programming: creating large training sets, quickly. Advances in neural information processing systems 29
Sakhaee N, Wilson MC (2021) Information extraction framework to build legislation network. Artif Intell Law 29(1):35–58
Smith LN (2015) Cyclical learning rates for training neural networks. arXiv:1506.01186
Souza F, Nogueira R, Lotufo R (2020) BERTimbau: pretrained BERT models for Brazilian Portuguese. In: Brazilian conference on intelligent systems. Springer, pp 403–417
Sun C, Qiu X, Xu Y et al (2019) How to fine-tune BERT for text classification? In: China national conference on Chinese computational linguistics. Springer, Cham, pp 194–206
Torfi A, Shirvani RA, Keneshloo Y et al (2020) Natural language processing advancements by deep learning: a survey. arXiv preprint arXiv:2003.01200
Vardhan H, Surana N, Tripathy B (2021) Named-entity recognition for legal documents. In: International conference on advanced machine learning technologies and applications. Springer, pp 469–479
Vasiliev Y (2020) Natural language processing with Python and spaCy: a practical introduction. No Starch Press, San Francisco
Wang S, Liu Y, Xu Y et al (2021) Want to reduce labeling cost? GPT-3 can help. arXiv:2108.13487
Wang S, Sun X, Li X et al (2023) GPT-NER: named entity recognition via large language models. arXiv:2304.10428
Wei X, Cui X, Cheng N et al (2023) Zero-shot information extraction via chatting with ChatGPT. arXiv:2302.10205
Zamani H, Croft WB (2018) On the theory of weak supervision for information retrieval. In: Proceedings of the 2018 ACM SIGIR international conference on theory of information retrieval, ICTIR ’18. Association for Computing Machinery, New York, NY, USA, pp 147–154. https://doi.org/10.1145/3234944.3234968
Zhang S, He L, Dragut E et al (2019) How to invest my time: lessons from human-in-the-loop entity extraction. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp 2305–2313
Zhou ZH (2018) A brief introduction to weakly supervised learning. Natl Sci Rev 5(1):44–53
Acknowledgements
The authors would like to thank Fundação de Apoio à Pesquisa do Distrito Federal (FAPDF), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES), Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP - process number 2023/10100-4), and project KnEDLe-UNB.
Cite this article
Oliveira, V., Nogueira, G., Faleiros, T. et al. Combining prompt-based language models and weak supervision for labeling named entity recognition on legal documents. Artif Intell Law (2024). https://doi.org/10.1007/s10506-023-09388-1