Text Simplification System for Legal Contract Review

  • Conference paper
Advances in Information and Communication (FICC 2024)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 919)

Abstract

People often avoid reading contracts because they are long and difficult to understand. This research addresses that problem by improving the accessibility and readability of contracts. The study proposes a system that integrates automated contract review with text simplification. The system uses a language model fine-tuned on legal data to extract salient clauses from contracts. Complex words are then replaced with simpler alternatives generated by language models trained on legal documents. The simplified output is then refined by splitting the text into shorter sentences based on their semantic hierarchy. Initial results show that readability improved: the simplified contracts are understandable at a 10th-grade level rather than requiring a postgraduate level of education. Human evaluations were generally positive, although the observed improvements were relatively minor. The research concludes that integrating text simplification with automated contract review has the potential to enhance contract readability, but that more research is needed to improve the quality of simplification.
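
The abstract reports readability as a school grade level but does not say which formula produced it. As an illustration only, the sketch below computes the Flesch-Kincaid grade level, one common grade-level formula. The syllable counter is a rough vowel-group heuristic, and the function names and both sample sentences are hypothetical, not taken from the paper.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as vowel groups (heuristic, not dictionary-based)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a likely silent trailing 'e' ("despite"), but not "-le" ("simple").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Hypothetical clause and a hand-written simplification (not from the paper).
original = (
    "The lessee shall indemnify the lessor against any liabilities arising "
    "from the utilization of the premises, notwithstanding any prior agreement."
)
simplified = (
    "The tenant must protect the owner from any claims. "
    "These claims come from using the property. "
    "This applies despite any earlier agreement."
)

print(f"original grade:   {flesch_kincaid_grade(original):.1f}")
print(f"simplified grade: {flesch_kincaid_grade(simplified):.1f}")
```

Both steps described in the abstract lower this score: replacing complex words reduces syllables per word, and splitting sentences reduces words per sentence. A lower grade does not by itself show that the legal meaning was preserved, which is why the human evaluation mentioned in the abstract is also needed.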

Author information

Corresponding author

Correspondence to Reginald Neil C. Recario.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Justo, J.M., Recario, R.N.C. (2024). Text Simplification System for Legal Contract Review. In: Arai, K. (eds) Advances in Information and Communication. FICC 2024. Lecture Notes in Networks and Systems, vol 919. Springer, Cham. https://doi.org/10.1007/978-3-031-53960-2_8
