Abstract
Summarization of legal case judgement documents is a practical and challenging problem, for which many summarization algorithms of different varieties have been tried. In this work, rather than developing yet another summarization algorithm, we investigate whether intelligently ensembling (combining) the outputs of multiple base summarization algorithms can lead to better summaries of legal case judgements than any of the base algorithms produces on its own. Using two datasets of case judgement documents from the Indian Supreme Court, one with extractive gold standard summaries and the other with abstractive gold standard summaries, we apply various ensembling techniques to summaries generated by a wide variety of summarization algorithms. The ensembling methods applied range from simple voting-based methods to ranking-based and graph-based ensembling methods. We show that many of our ensembling methods yield summaries that are better than the summaries produced by any of the individual base algorithms, in terms of ROUGE and METEOR scores.
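As a rough illustration of the simplest family named in the abstract, voting-based ensembling, the sketch below combines several extractive summaries by counting how many base summarizers selected each sentence. It is not the authors' implementation: the function name `vote_ensemble`, the representation of each summary as a list of sentence indices, the fixed sentence budget, and the tie-breaking rule (earlier sentences first) are all assumptions made for this example.

```python
from collections import Counter


def vote_ensemble(base_summaries, max_sentences):
    """Combine extractive summaries by majority-style voting.

    base_summaries: list of summaries, each a list of sentence indices
                    (positions in the source document) chosen by one
                    base summarizer. This representation is an assumption.
    max_sentences:  hypothetical length budget for the ensemble summary.
    """
    votes = Counter()
    for summary in base_summaries:
        # Each base summarizer votes at most once per sentence.
        votes.update(set(summary))
    # Rank by vote count (descending); break ties by document position.
    ranked = sorted(votes, key=lambda idx: (-votes[idx], idx))
    # Return the selected sentences in their original document order.
    return sorted(ranked[:max_sentences])


# Toy input: three base summarizers, each selecting three sentences.
base_summaries = [[0, 3, 7], [0, 2, 3], [3, 7, 9]]
ensemble = vote_ensemble(base_summaries, max_sentences=3)
print(ensemble)
```

In this toy input, sentence 3 receives three votes and sentences 0 and 7 receive two each, so those three are kept. Ranking-based and graph-based ensembling would replace the vote count with a different scoring step.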
Notes
A few sentences from the original legal documents have been modified slightly by the experts to improve their grammatical flow.
Since there is only one supervised domain-specific algorithm, we do not separately consider domain-specific and domain-independent algorithms among supervised ones.
Acknowledgements
The authors acknowledge the anonymous reviewers whose comments greatly helped to improve the paper. The research is partially supported by the TCG Centres for Research and Education in Science and Technology (CREST), India through a project titled “Smart Legal Consultant: AI-based Legal Analytics”.
About this article
Cite this article
Deroy, A., Ghosh, K. & Ghosh, S. Ensemble methods for improving extractive summarization of legal case judgements. Artif Intell Law 32, 231–289 (2024). https://doi.org/10.1007/s10506-023-09349-8
DOI: https://doi.org/10.1007/s10506-023-09349-8