Artificial Intelligence and Law

Volume 18, Issue 1, pp 45–76

Identification of Rhetorical Roles for Segmentation and Summarization of a Legal Judgment



Legal judgments are complex, so experts write a brief summary of each judgment, known as a headnote, to enable quick perusal. Headnote generation is time-consuming, and there have been attempts to automate it. Automatically generated summaries are difficult to interpret because they are not coherent and do not convey the relative relevance of the various components of the judgment. A legal judgment can be segmented into coherent chunks based on the rhetorical roles played by its sentences. In this paper, a comprehensive system is proposed for labeling sentences with their rhetorical roles and extracting structured headnotes automatically from legal judgments. An annotated data set was created with the help of legal experts and used as training data. A machine learning technique, the Conditional Random Field, is applied to perform document segmentation by identifying the rhetorical roles. The present work also describes the application of probabilistic models for extracting key sentences and composing the relevant chunks into a headnote. Understanding the basic structures and distinct segments is shown to improve the final presentation of the summary. Moreover, by adding simple additional features, the system can be extended to other legal sub-domains. The proposed system has been empirically evaluated and found to be highly effective on both the segmentation and summarization tasks. Generating the final summary with its underlying rhetorical roles improves the readability of the summary and the efficiency of the system.


Document segmentation; Rhetorical roles; Conditional random field; Document summarization; K-mixture
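The abstract mentions probabilistic models for extracting key sentences, and the keywords name the K-mixture. As a rough illustration only, the sketch below computes the standard K-mixture term distribution (Katz's model, as it is usually presented in textbooks) from collection frequency and document frequency. The function names, the whitespace tokenizer, and the toy corpus are assumptions made for this example; the abstract does not specify how the paper turns such term probabilities into sentence scores, so that step is deliberately left out.

```python
from collections import Counter


def term_stats(docs):
    """Return (cf, df, n_docs): collection and document frequency per term.

    Tokenization by lowercased whitespace split is an assumption of this sketch.
    """
    cf, df = Counter(), Counter()
    for doc in docs:
        tokens = doc.lower().split()
        cf.update(tokens)        # total occurrences across the collection
        df.update(set(tokens))   # number of documents containing the term
    return cf, df, len(docs)


def k_mixture_prob(k, cf_t, df_t, n_docs):
    """P(a term occurs exactly k times in a document) under the K-mixture.

    With lam = cf/N and beta = (cf - df)/df, the usual form
        P(k) = (1 - alpha) * [k == 0] + alpha/(beta + 1) * (beta/(beta + 1))**k,
        alpha = lam / beta
    is rewritten here without alpha so that beta == 0 (a term that never
    repeats inside a document) does not cause a division by zero.
    The term must occur at least once in the collection (df_t > 0).
    """
    lam = cf_t / n_docs
    beta = (cf_t - df_t) / df_t
    if k == 0:
        return 1.0 - lam / (beta + 1.0)
    return lam * beta ** (k - 1) / (beta + 1.0) ** (k + 1)


# Toy corpus, invented for illustration only.
docs = [
    "the court held that the appeal fails",
    "the appeal was dismissed by the court",
    "costs were awarded",
]
cf, df, n_docs = term_stats(docs)
p_the = [k_mixture_prob(k, cf["the"], df["the"], n_docs) for k in range(50)]
```

In this form P(0) equals 1 - df/N, the fraction of documents that do not contain the term, and the probabilities over all k sum to one. A summarizer could aggregate such term probabilities per sentence to rank candidates, but the aggregation rule would have to come from the full paper rather than from this sketch.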


Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  1. Department of Computer Science and Engineering, IIT Madras, Chennai, India