Passage-Based Text Summarization for Legal Information Retrieval

  • Research Article - Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

Automatic text summarization is the process of condensing a text document while retaining its most important information. It plays a significant role in tasks such as text categorization, question answering and information retrieval (IR). Legal information retrieval (LIR) is a subfield of IR, and the produced summaries are integrated into the IR system with the objective of reducing document length. This reduces the access time for searching while still retrieving relevant documents. In this article, we present the creation of passage-level summaries (generic and legal) at different compression ratios and evaluate their performance. Generic summaries give an overall description of the essential information in a document, whereas legal summaries are produced by taking into account the domain-specific features present in the document. We then propose Boosting Okapi BM25, a modification of the Okapi BM25 model, to increase the efficiency of LIR. We evaluate the proposed LIR approach in terms of MAP and R-precision, and the summarization approach using the ROUGE tool, on the FIRE2013 and FIRE2014 datasets. To show the efficacy of the proposed system, we compare its experimental results in terms of MAP with those of other IR models: PL2, \(In\_expB2\), \(In\_expC2\), InL2, \(DFR\_BM25\) and Okapi BM25. The proposed system performs better than these existing IR models on the reported performance metrics. The empirical results also show that integrating text summarization with IR techniques helps retrieve relevant information with less access time.
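As a point of reference for the baseline and the metrics named in the abstract, the following is a minimal sketch of standard Okapi BM25 scoring together with average precision, MAP and R-precision. It is not the proposed Boosting Okapi BM25, which the abstract does not define. The function names, the whitespace-free toy tokenization (documents are already lists of tokens), the non-negative IDF variant, and the defaults `k1=1.2` and `b=0.75` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score tokenized documents against a tokenized query with standard Okapi BM25.

    k1 and b are conventional defaults, assumed here rather than taken from the article.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant (assumed); the classic form omits the "1 +".
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def average_precision(ranked_ids, relevant):
    """Mean of precision values at the ranks where relevant documents appear."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs):
    """MAP over an iterable of (ranked_ids, relevant_set) pairs, one per query."""
    runs = list(runs)
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def r_precision(ranked_ids, relevant):
    """Precision at rank R, where R is the number of relevant documents."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for doc_id in ranked_ids[:r] if doc_id in relevant) / r
```

In a summary-based setup like the one described, `docs` would hold the tokenized passage-level summaries rather than the full documents, so the index is smaller while the ranking and evaluation code stays the same.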

Notes

  1. http://www.isical.ac.in/~fire/2013/legal.html.

  2. http://www.isical.ac.in/~fire/2014/legal.html.

  3. http://www.isical.ac.in/~fire/.

  4. http://research.nii.ac.jp/ntcir/index-en.html.

  5. http://lucene.apache.org/.

  6. https://lawyerslaw.org/legaldictionary-download-a-free-law-dictionary-software-with-2500-legal-terms/.

  7. https://www.oxfordlearnersdictionaries.com/wordlist/english/oxford3000/.

  8. http://terrier.org/docs/current/configure_retrieval.html.

Author information

Corresponding author

Correspondence to Ambedkar Kanapala.

About this article

Cite this article

Kanapala, A., Jannu, S. & Pamula, R. Passage-Based Text Summarization for Legal Information Retrieval. Arab J Sci Eng 44, 9159–9169 (2019). https://doi.org/10.1007/s13369-019-03998-1

  • DOI: https://doi.org/10.1007/s13369-019-03998-1
