Using attention methods to predict judicial outcomes

Bertalan, Vithor Gomes Ferreira; Ruiz, Evandro Eduardo Seron

doi:10.1007/s10506-022-09342-7

Original Research
Published: 27 December 2022

Volume 32, pages 87–115 (2024)

Artificial Intelligence and Law

Vithor Gomes Ferreira Bertalan¹ &
Evandro Eduardo Seron Ruiz²

A Correction to this article was published on 09 February 2023

This article has been updated

Abstract

The prediction of legal judgments is one of the most recognized research topics at the intersection of Natural Language Processing, Artificial Intelligence, and Law. By legal prediction, we mean intelligent systems capable of predicting specific judicial characteristics, such as the judicial outcome or the judicial class of a particular case. In this study, we used artificial intelligence classifiers to predict the decisions of Brazilian courts. To this end, we developed a text crawler to extract data from official Brazilian electronic legal systems and built two datasets, one of second-degree murder cases and one of active corruption cases. We applied various classifiers, such as Support Vector Machines, Neural Networks, and others, to predict judicial outcomes by analyzing text features from the datasets. Our research demonstrated that Regression Trees, Gated Recurrent Units, and Hierarchical Attention Networks tended to achieve higher metrics across our datasets. As a final goal, we inspected the weights of one of the algorithms, Hierarchical Attention Networks, to find samples of the words that might be used to acquit or convict defendants, based on their relevance to the algorithm.
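
The final step described in the abstract, reading word relevance out of a Hierarchical Attention Network, can be illustrated with a minimal sketch. It assumes the word-level attention formulation introduced with Hierarchical Attention Networks (Yang et al. 2016): each hidden state h_t is projected to u_t = tanh(W h_t + b), scored against a context vector u_w, and normalised with a softmax. The function names, the toy English words, and every numeric value below are hypothetical and chosen only for illustration; this is not the authors' code, and in a trained model W, b, and u_w would be learned parameters.

```python
import math

def word_attention(hidden_states, W, b, context):
    """Return one softmax attention weight per word (hypothetical helper)."""
    scores = []
    for h in hidden_states:
        # u_t = tanh(W h_t + b): project the hidden state of each word
        u = [math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
             for row, b_i in zip(W, b)]
        # score_t = u_t . u_w: similarity with the context vector
        scores.append(sum(u_i * c_i for u_i, c_i in zip(u, context)))
    # Numerically stable softmax over the word scores
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rank_words(words, weights):
    """Sort words by attention weight, most relevant first."""
    return sorted(zip(words, weights), key=lambda pair: pair[1], reverse=True)

# Toy inputs (assumed values, not taken from the study's datasets)
words = ["defendant", "acquitted", "insufficient", "evidence"]
hidden_states = [[0.1, 0.0], [0.9, 0.8], [0.5, 0.4], [0.3, 0.2]]
W = [[1.0, 0.0], [0.0, 1.0]]   # stand-in for a learned projection matrix
b = [0.0, 0.0]                 # stand-in for a learned bias
context = [1.0, 1.0]           # stand-in for the learned context vector u_w

weights = word_attention(hidden_states, W, b, context)
ranking = rank_words(words, weights)
for word, weight in ranking:
    print(f"{word}: {weight:.3f}")
```

Words with larger weights contribute more to the sentence representation, which is the sense in which such weights can be searched for terms associated with one outcome or the other.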

Change history

09 February 2023
A Correction to this paper has been published: https://doi.org/10.1007/s10506-023-09346-x

Notes

Tribunal de Justiça do Estado de São Paulo, Brasil, TJSP.
https://www.tjsp.jus.br/QuemSomos (in Brazilian Portuguese).
http://www.brazil.gov.br/about-brazil/news/2018/11/civil-law-tradition-guides-rights-in-brazil-but-common-law-is-also-present.
http://esaj.tjsp.jus.br/cjpg/.
https://github.com/vbertalan.

Author information

Authors and Affiliations

Département de Génie Informatique et Génie Logiciel de l’École Polytechnique de Montréal, Université de Montréal, Montréal, Canada
Vithor Gomes Ferreira Bertalan
Departamento de Computação e Matemática da FFCLRP, Universidade de São Paulo, Ribeirão Preto, Brazil
Evandro Eduardo Seron Ruiz

Corresponding author

Correspondence to Vithor Gomes Ferreira Bertalan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bertalan, V.G.F., Ruiz, E.E.S. Using attention methods to predict judicial outcomes. Artif Intell Law 32, 87–115 (2024). https://doi.org/10.1007/s10506-022-09342-7

Accepted: 08 September 2022
Published: 27 December 2022
Issue Date: March 2024
DOI: https://doi.org/10.1007/s10506-022-09342-7
