Abstract
Contract review is costly and time-consuming for lawyers and clients alike, because identifying and evaluating the legal implications of individual clauses requires significant effort. To address this challenge, we propose using natural language processing, specifically text classification based on deontic tags, to streamline the process. Our research question is whether dense vector embeddings can help semi-automate contract review and reduce the time and cost incurred by legal professionals reviewing deontic modalities in contracts. In this study, we create a domain-specific dataset and train both baseline and neural network models for contract sentence classification. This approach, which mimics the work of a lawyer, offers a more efficient and cost-effective route to contract review. It achieves an accuracy of 0.90, showcasing its effectiveness in identifying and evaluating individual contract sentences.
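To make the idea of classifying contract sentences by deontic tag concrete, the following is a minimal sketch and not the models used in the study. It assumes three illustrative tags (obligation, permission, prohibition), a handful of invented training sentences, and a multinomial Naive Bayes classifier over unigrams and bigrams as a stand-in baseline. The abstract does not specify the tag set, the baseline models, or the embedding method, so every name and data point below is hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data: tag names and sentences are illustrative assumptions,
# not taken from the paper's dataset.
TRAIN = [
    ("The Supplier shall deliver the goods within thirty days.", "obligation"),
    ("The Tenant must pay rent on the first day of each month.", "obligation"),
    ("The Licensee may terminate this Agreement upon written notice.", "permission"),
    ("The Buyer is entitled to inspect the goods before acceptance.", "permission"),
    ("The Contractor shall not assign this Agreement to a third party.", "prohibition"),
    ("The Employee must not disclose confidential information.", "prohibition"),
]


def tokenize(sentence):
    """Lowercase the sentence and return unigrams plus bigrams.

    Bigrams let the model treat 'shall not' differently from 'shall'.
    """
    words = re.findall(r"[a-z]+", sentence.lower())
    return words + [f"{a}_{b}" for a, b in zip(words, words[1:])]


def train(examples):
    """Collect per-tag token counts for a multinomial Naive Bayes model."""
    token_counts = defaultdict(Counter)
    tag_counts = Counter()
    vocab = set()
    for sentence, tag in examples:
        tokens = tokenize(sentence)
        token_counts[tag].update(tokens)
        tag_counts[tag] += 1
        vocab.update(tokens)
    return token_counts, tag_counts, vocab


def predict(model, sentence):
    """Return the tag with the highest log-probability (add-one smoothing)."""
    token_counts, tag_counts, vocab = model
    total = sum(tag_counts.values())
    scores = {}
    for tag in tag_counts:
        denom = sum(token_counts[tag].values()) + len(vocab)
        score = math.log(tag_counts[tag] / total)
        for tok in tokenize(sentence):
            if tok in vocab:  # tokens unseen in training are ignored
                score += math.log((token_counts[tag][tok] + 1) / denom)
        scores[tag] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(predict(model, "The Vendor shall not disclose the price."))  # prohibition
```

A word-count baseline like this leans heavily on modal verbs such as "shall", "may", and "must not". Dense vector embeddings, as named in the research question, are one way to capture modality that is expressed without those surface cues.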
Acknowledgements
We wish to thank the following individuals who took time out of their busy schedules to manually annotate contracts: Andrene Hutchinson, K. Teddison Maye-Jackson, Odane Lennon, Ryan Gordon and Tishanna Maxwell.
Funding
No funding was received to assist with the preparation of this manuscript.
Ethics declarations
Conflict of interest
The authors declare that there are no financial or non-financial interests that are directly or indirectly related to the work submitted for publication.
Ethics approval
No ethics approval is required for this study.
About this article
Cite this article
Graham, S.G., Soltani, H. & Isiaq, O. Natural language processing for legal document review: categorising deontic modalities in contracts. Artif Intell Law (2023). https://doi.org/10.1007/s10506-023-09379-2
DOI: https://doi.org/10.1007/s10506-023-09379-2