Using attention methods to predict judicial outcomes

  • Original Research
Artificial Intelligence and Law

A Correction to this article was published on 09 February 2023

This article has been updated

Abstract

Legal judgment prediction is one of the most widely recognized research areas at the intersection of Natural Language Processing, Artificial Intelligence, and Law. By legal prediction, we mean intelligent systems capable of predicting specific judicial characteristics, such as the judicial outcome, the judicial class, and the prediction for a particular case. In this study, we used artificial intelligence classifiers to predict the decisions of Brazilian courts. To this end, we developed a text crawler to extract data from official Brazilian electronic legal systems and built two datasets, one of second-degree murder cases and one of active corruption cases. We applied various classifiers, including Support Vector Machines and Neural Networks, to predict judicial outcomes from text features of the datasets. Our experiments showed that Regression Trees, Gated Recurrent Units, and Hierarchical Attention Networks tended to achieve higher metrics across our datasets. As a final goal, we inspected the weights of one of these algorithms, Hierarchical Attention Networks, to find samples of the words that might be used to acquit or convict defendants, based on their relevance to the algorithm.
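To make the last step concrete, the following is a minimal sketch of how word-level attention weights can be computed and ranked, following the general formulation in Yang et al. (2016): each word annotation is passed through a one-layer tanh projection, compared with a learned context vector, and normalized with a softmax. It is not the authors' implementation. The function names, the toy tokens, and all numeric values are assumptions made up for illustration; in a trained network the annotations, projection, and context vector would come from the model.

```python
import math

def word_attention(hidden_states, W, b, context):
    """Softmax attention weights over words, in the style of Yang et al. (2016).

    hidden_states: one annotation vector per word (hypothetical toy values here).
    W, b: projection matrix and bias; context: word-level context vector.
    """
    scores = []
    for h in hidden_states:
        # u = tanh(W h + b): hidden representation of the word annotation
        u = [math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
             for row, b_i in zip(W, b)]
        # Similarity between u and the context vector
        scores.append(sum(u_i * c_i for u_i, c_i in zip(u, context)))
    # Numerically stable softmax over the per-word scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rank_words(words, weights, top_k=3):
    """Return the top_k (word, weight) pairs, highest attention weight first."""
    return sorted(zip(words, weights), key=lambda p: p[1], reverse=True)[:top_k]

# Toy inputs (assumed, not taken from the article's datasets)
words = ["defendant", "was", "acquitted"]
hidden_states = [[0.1, 0.3], [0.0, 0.1], [0.9, 0.8]]
W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection, for readability
b = [0.0, 0.0]
context = [1.0, 1.0]

weights = word_attention(hidden_states, W, b, context)
top = rank_words(words, weights, top_k=2)
for word, weight in top:
    print(f"{word}: {weight:.3f}")
```

Ranking words by these weights gives one view of which tokens the network attended to most in a document. A high weight indicates relevance to the model's decision, not a causal explanation of the judicial outcome.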

Fig. 1

Fig. 2. Source: Yang et al. (2016)

Fig. 3. Source: Yang et al. (2016)

Fig. 4

Notes

  1. Tribunal de Justiça do Estado de São Paulo, Brasil, TJSP.

  2. https://www.tjsp.jus.br/QuemSomos (in Brazilian Portuguese).

  3. http://www.brazil.gov.br/about-brazil/news/2018/11/civil-law-tradition-guides-rights-in-brazil-but-common-law-is-also-present.

  4. http://esaj.tjsp.jus.br/cjpg/.

  5. https://github.com/vbertalan.

References

  • Alarie B, Niblett A, Yoon AH (2018) How artificial intelligence will affect the practice of law. Univ Tor Law J 68(supplement 1):106–124

  • Aletras N, Tsarapatsanis D, Preoţiuc-Pietro D, Lampos V (2016) Predicting judicial decisions of the European Court of Human Rights: a natural language processing perspective. Peer J Comput Sci 2:e93

  • Alschner W, Skougarevskiy D (2017) Towards an automated production of legal texts using recurrent neural networks. In: Proceedings of the 16th Edition of the International Conference on Artificial Intelligence and Law, Association for Computing Machinery, New York, NY, USA, ICAIL ’17, pp 229–232, https://doi.org/10.1145/3086512.3086536

  • Antonucci L, Crocetta C, d’Ovidio FD (2014) Evaluation of Italian judicial system. Proc Econ Financ 17:121–130

  • Antos A, Nadhamuni N (2021) Practical guide to artificial intelligence and contract review. In: Research Handbook on Big Data Law, Edward Elgar Publishing

  • Ashley KD, Brüninghaus S (2009) Automatically classifying case texts and predicting outcomes. Artif Intell Law 17(2):125–165

  • Balakrishnama S, Ganapathiraju A (1998) Linear discriminant analysis-a brief tutorial. Inst Signal Inf Process 18(1998):1–8

  • Bishop CM (2006) Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag, Berlin, Heidelberg

  • Branting LK, Yeh A, Weiss B, Merkhofer E, Brown B (2015) Inducing predictive models for decision support in administrative adjudication. In: AI Approaches to the Complexity of Legal Systems, Springer, pp 465–477

  • Cambria E, White B (2014) Jumping NLP curves: a review of natural language processing research [review article]. IEEE Comput Intell Mag 9(2):48–57. https://doi.org/10.1109/MCI.2014.2307227

  • Chalkidis I, Fergadiotis M, Malakasiotis P, Aletras N, Androutsopoulos I (2019) Extreme multi-label legal text classification: A case study in EU legislation. arXiv preprint arXiv:1905.10892

  • Chantar HK, Corne DW (2011) Feature subset selection for Arabic document categorization using bpso-knn. In: 2011 Third World Congress on Nature and Biologically Inspired Computing, IEEE, pp 546–551

  • Chi Y, Zhang P, Wang F, Lu T, Gu N (2022) Legal judgement prediction of sentence commutation with multi-document information. In: CCF Conference on Computer Supported Cooperative Work and Social Computing, Springer, pp 473–487

  • Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555

  • Dar JA, Srivastava KK, Lone SA (2022) Spectral features and optimal hierarchical attention networks for pulmonary abnormality detection from the respiratory sound signals. Biomed Signal Process Control 78:103905

  • Desmet B, Hoste V (2014) Recognising suicidal messages in dutch social media. In: 9th international conference on language resources and evaluation (LREC), pp 830–835

  • de Sa CA, Santos RLdS, Moura RS (2017) An approach for defining the author reputation of comments on products. In: International Conference on Applications of Natural Language to Information Systems, Springer, pp 326–331

  • Do PK, Nguyen HT, Tran CX, Nguyen MT, Nguyen ML (2017) Legal question answering using ranking svm and deep convolutional neural network. arXiv preprint arXiv:1703.05320

  • Gao S, Young MT, Qiu JX, Yoon HJ, Christian JB, Fearn PA, Tourassi GD, Ramanathan A (2018) Hierarchical attention networks for information extraction from cancer pathology reports. J Am Med Inf Assoc 25(3):321–330

  • Gokhale R, Fasli M (2017) Deploying a co-training algorithm to classify human-rights abuses. In: 2017 International Conference on the Frontiers and Advances in Data Science (FADS), IEEE, pp 108–113

  • Hartmann N, Fonseca E, Shulby C, Treviso M, Rodrigues J, Aluisio S (2017) Portuguese word embeddings: evaluating on word analogies and natural language tasks. arXiv preprint arXiv:1708.06025

  • He X, Shi S, Geng X, Xu L (2022) Hierarchical attention-based context-aware network for red tide forecasting. Appl Soft Comput 127:109337

  • Kanakaraj M, Guddeti RMR (2015) Performance analysis of ensemble methods on twitter sentiment analysis using NLP techniques. In: Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015), IEEE, pp 169–170

  • Kastellec JP (2010) The statistical analysis of judicial decisions and legal rules with classification trees. J Empir Leg Stud 7(2):202–230

  • Krestel R, Fankhauser P, Nejdl W (2009) Latent dirichlet allocation for tag recommendation. In: Proceedings of the third ACM conference on Recommender systems, pp 61–68

  • Kufandirimbwa O, Kuranga C (2012) Towards judicial data mining: arguing for adoption in the judicial system. Online J Phys Environ Sci Res 1(2):15–21

  • Le TTN, Shirai K, Le Nguyen M, Shimazu A (2015) Extracting indices from Japanese legal documents. Artif Intell Law 23(4):315–344

  • Li X, Chen W, Wang T, Huang W (2017) Target-specific convolutional bi-directional lstm neural network for political ideology analysis. In: Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data, Springer, pp 64–72

  • Liu Z, Chen H (2017) A predictive performance comparison of machine learning models for judicial cases. In: 2017 IEEE Symposium series on computational intelligence (SSCI), IEEE, pp 1–6

  • Liu Z, Tu C, Sun M (2019) Legal cause prediction with inner descriptions and outer hierarchies. In: China National Conference on Chinese Computational Linguistics, Springer, pp 573–586

  • Loh WY (2011) Classification and regression trees. Wiley interdiscip Rev: Data Min Knowl Discov 1(1):14–23

  • Luo B, Feng Y, Xu J, Zhang X, Zhao D (2017) Learning to predict charges for criminal cases with legal basis. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Copenhagen, Denmark, pp 2727–2736

  • Ma J, Gao W, Joty S, Wong KF (2019) Sentence-level evidence embedding for claim verification with hierarchical attention networks. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, pp 2561–2571

  • Mac Kim S, Xu Q, Qu L, Wan S, Paris C (2017) Demographic inference on twitter using recursive neural networks. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp 471–477

  • Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval. Cambridge University Press, New York, NY, USA

  • McShane BB, Watson OP, Baker T, Griffith SJ (2012) Predicting securities fraud settlements and amounts: a hierarchical bayesian model of federal securities class action lawsuits. J Empir Leg Stud 9(3):482–510

  • Moens MF (2001) Innovative techniques for legal text retrieval. Artif Intell Law 9(1):29–57

  • Obasi CK, Ugwu C (2015) Feature selection and vectorization in legal case documents using chi-square statistical analysis and naïve bayes approaches. IOSR J Comput Eng 17(2):42–50

  • Oliveira FLd, Cunha LG (2020) The indicators on the brazilian judiciary: limitations, challenges and the use of technology. Revista Direito GV 16(1)

  • Pavlinek M, Podgorelec V (2017) Text classification method based on self-training and lda topic models. Expert Syst Appl 80:83–93

  • Pelle R, Alcântara C, Moreira VP (2018) A classifier ensemble for offensive text detection. In: Proceedings of the 24th Brazilian Symposium on Multimedia and the Web, Association for Computing Machinery, New York, NY, USA, WebMedia ’18, pp 237–243

  • Pennington J, Socher R, Manning CD (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1532–1543

  • Rao A, Spasojevic N (2016) Actionable and political text classification using word embeddings and lstm. arXiv preprint arXiv:1607.02501

  • Remmits Y (2017) Finding the topics of case law: Latent dirichlet allocation on supreme court decisions. PhD thesis, Radboud Universiteit

  • Rios-Figueroa J (2006) Judicial independence and corruption: An analysis of latin america. Available at SSRN 912924

  • Roy D, Dutta M (2022) Optimal hierarchical attention network-based sentiment analysis for movie recommendation. Soc Netw Anal Min 12(1):1–16

  • Sannier N, Adedjouma M, Sabetzadeh M, Briand L (2017) An automated framework for detection and resolution of cross references in legal texts. Requir Eng 22(2):215–237

  • Sulea OM, Zampieri M, Malmasi S, Vela M, Dinu LP, Van Genabith J (2017a) Exploring the use of text classification in the legal domain. arXiv preprint arXiv:1710.09306

  • Sulea OM, Zampieri M, Vela M, Van Genabith J (2017b) Predicting the law area and decisions of french supreme court cases. arXiv preprint arXiv:1708.01681

  • Sun C, Zhang Y, Liu X, Wu F (2020) Legal Intelligence: Algorithmic, Data, and Social Challenges. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp 2464–2467

  • Surden H (2014) Machine learning and law. Wash Law Rev 89:87–115

  • Tamilarasan Ramasamy DJJ (2022) Early risk detection of depression from social media posts using hierarchical attention networks. J Algebr Stat 13(1):483–489

  • Tarnpradab S, Liu F, Hua KA (2017) Toward extractive summarization of online forum discussions via hierarchical attention networks. In: The Thirtieth International Flairs Conference

  • Tran OT, Ngo BX, Le Nguyen M, Shimazu A (2014) Automated reference resolution in legal texts. Artif Intell Law 22(1):29–60

  • Turian J, Ratinov L, Bengio Y (2010) Word representations: a simple and general method for semi-supervised learning. In: Proceedings of the 48th annual meeting of the association for computational linguistics, Association for Computational Linguistics, pp 384–394

  • Wang J, Deng H, Liu B, Hu A, Liang J, Fan L, Zheng X, Wang T, Lei J (2020) Systematic Evaluation of Research Progress on Natural Language Processing in Medicine Over the Past 20 Years: Bibliometric Study on PubMed. J Med Internet Res 22(1):e16816. https://doi.org/10.2196/16816, URL http://www.ncbi.nlm.nih.gov/pubmed/32012074

  • Wenguan W, Yunwen C, Hua C, Yanneng Z, Huiyu Y (2019) Judicial document intellectual processing using hybrid deep neural networks. J Tsinghua Univ (Sci Technol) 59(7):505–511

  • Xie J, Liu X, Dajun Zeng D (2018) Mining e-cigarette adverse events in social media using bi-lstm recurrent neural network with word embedding representation. J Am Med Inf Assoc 25(1):72–80

  • Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E (2016) Hierarchical attention networks for document classification. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies, Association for Computational Linguistics, San Diego, California, pp 1480–1489

  • Zeng Y, Wang R, Zeleznikow J, Kemp E (2007) A knowledge representation model for the intelligent retrieval of legal cases. Int J Law Inf Technol 15(3):299–319

  • Zhang Z, Robinson D, Tepper J (2018) Detecting hate speech on twitter using a convolution-gru based deep neural network. In: European semantic web conference, Springer, pp 745–760

Author information

Corresponding author

Correspondence to Vithor Gomes Ferreira Bertalan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bertalan, V.G.F., Ruiz, E.E.S. Using attention methods to predict judicial outcomes. Artif Intell Law 32, 87–115 (2024). https://doi.org/10.1007/s10506-022-09342-7

  • DOI: https://doi.org/10.1007/s10506-022-09342-7
