Abstract
Automatic text summarization condenses a text document while preserving its most important information. It plays a significant role in tasks such as text categorization, question answering and information retrieval (IR). Legal information retrieval (LIR) is a subfield of IR, and in this work the produced summaries are integrated into the IR system with the objective of reducing document length. This shortens the access time for searching and helps retrieve relevant documents. In this article, we present the creation of passage-level summaries (generic and legal) at different compression ratios and evaluate their performance. Generic summaries give an overall description of the essential information in a document, whereas legal summaries are produced by taking into account the domain-specific features present in the document. Next, we propose Boosting Okapi BM25, a modified version of the Okapi BM25 model, to increase the efficiency of LIR. We evaluate the proposed LIR approach in terms of MAP and R-precision, and the summarization approach using the ROUGE tool, on the FIRE2013 and FIRE2014 datasets. To show the efficacy of the proposed system, we compare its experimental results in terms of MAP with those of different IR models: PL2, \(In\_expB2\), \(In\_expC2\), InL2, \(DFR\_BM25\) and Okapi BM25. The proposed system performs better than these existing IR models on the reported metrics. The empirical results also show that integrating text summarization with IR techniques helps retrieve relevant information with less access time.
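For readers unfamiliar with the baseline and the metrics named above, the following is a minimal sketch of standard Okapi BM25 scoring together with average precision (the per-query quantity averaged to obtain MAP) and R-precision. It is not the proposed Boosting Okapi BM25, whose modification is not described in the abstract. The function names, the parameter values `k1=1.2` and `b=0.75`, and the non-negative idf variant are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score tokenized docs against a tokenized query with standard Okapi BM25.

    k1 and b are conventional defaults (an assumption, not the paper's settings).
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative idf variant: the 1 + ... keeps weights above zero.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def average_precision(ranking, relevant):
    """Average of precision values at the ranks where relevant docs appear."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def r_precision(ranking, relevant):
    """Precision within the top R results, where R is the number of relevant docs."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for doc_id in ranking[:r] if doc_id in relevant) / r


def mean_average_precision(runs):
    """MAP over (ranking, relevant_set) pairs, one pair per query."""
    return sum(average_precision(rk, rel) for rk, rel in runs) / len(runs)
```

In a summarization-based pipeline such as the one described, `docs` would hold the tokenized summaries rather than the full documents, so each scored unit is shorter; this usage is an interpretation of the abstract, not a detail it states.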
About this article
Cite this article
Kanapala, A., Jannu, S. & Pamula, R. Passage-Based Text Summarization for Legal Information Retrieval. Arab J Sci Eng 44, 9159–9169 (2019). https://doi.org/10.1007/s13369-019-03998-1
DOI: https://doi.org/10.1007/s13369-019-03998-1