Abstract
Legal judgment prediction is one of the most recognized fields at the intersection of Natural Language Processing, Artificial Intelligence, and Law. By legal prediction, we mean intelligent systems capable of predicting specific judicial characteristics, such as the judicial outcome, the judicial class, or the prediction for a particular case. In this study, we used artificial intelligence classifiers to predict the decisions of Brazilian courts. To this end, we developed a text crawler to extract data from official Brazilian electronic legal systems, producing two datasets: one of second-degree murder cases and one of active corruption cases. We applied several classifiers, including Support Vector Machines and Neural Networks, to predict judicial outcomes from text features of these datasets. Regression Trees, Gated Recurrent Units, and Hierarchical Attention Networks tended to score higher on the evaluation metrics across our datasets. Finally, we inspected the weights of one of these models, the Hierarchical Attention Network, to find examples of words that might be associated with acquitting or convicting defendants, based on their relevance to the model.
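To illustrate the idea of ranking words by their relevance to an attention model, the following is a minimal sketch and not the authors' implementation. It assumes a simplified word-level attention in which each word's hidden state is scored by a dot product with a context vector and the scores are normalized with a softmax; the projection layer that a full Hierarchical Attention Network applies before scoring is omitted. The tokens, hidden states, context vector, and function names are all hypothetical toy values chosen for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_attention(hidden_states, context_vector):
    """Score each word by the dot product of its hidden state with the context vector."""
    scores = [
        sum(h * c for h, c in zip(state, context_vector))
        for state in hidden_states
    ]
    return softmax(scores)


def rank_words(words, weights):
    """Pair words with their attention weights, highest weight first."""
    return sorted(zip(words, weights), key=lambda pair: pair[1], reverse=True)


# Hypothetical toy inputs: made-up tokens and 2-d hidden states, not data from the study.
words = ["defendant", "acquitted", "the", "evidence"]
hidden_states = [[0.2, 0.1], [1.5, 0.9], [0.0, 0.0], [0.7, 0.4]]
context_vector = [1.0, 1.0]  # assumed stand-in for a learned word-level context vector

weights = word_attention(hidden_states, context_vector)
for word, weight in rank_words(words, weights):
    print(f"{word}: {weight:.3f}")
```

In a trained model the hidden states and context vector would come from the network itself; words with consistently high weights in acquittal or conviction decisions are candidates for the kind of inspection described above.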
Change history
09 February 2023
A Correction to this paper has been published: https://doi.org/10.1007/s10506-023-09346-x
Notes
Tribunal de Justiça do Estado de São Paulo, Brasil, TJSP.
https://www.tjsp.jus.br/QuemSomos (in Brazilian Portuguese).
About this article
Cite this article
Bertalan, V.G.F., Ruiz, E.E.S. Using attention methods to predict judicial outcomes. Artif Intell Law 32, 87–115 (2024). https://doi.org/10.1007/s10506-022-09342-7
DOI: https://doi.org/10.1007/s10506-022-09342-7